Sentiment analysis on multilingual code mixing text using BERT-BASE: participation of IRLab@IIT(BHU) in Dravidian-CodeMix and HASOC tasks of FIRE2020
Abstract
This paper describes our participation in two FIRE 2020 tracks, “Sentiment Analysis in Dravidian-CodeMix” (Dravidian-CodeMix) and “Hate Speech and Offensive Content Identification in Indo-European Languages” (HASOC), whose tasks involve identifying subjective opinions or reactions on a given topic. Several techniques have been applied to sentiment analysis, including recent word-embedding-based methods. BERT, Word2Vec, and ELMo are currently among the most promising and ready-to-use word embedding methods for converting words into meaningful vectors. We used the BERT_BASE model for sentiment classification of the Dravidian-CodeMix data. For the HASOC task, our team submitted BERT-based systems for both sub-tasks in three languages: Hindi, English, and German. We report our approach and our results, which are promising.
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).