IRLab@IITBHU@Dravidian-CodeMix-FIRE2020: Sentiment Analysis for Dravidian Languages in Code-Mixed Text
Abstract
This paper describes the IRLab@IITBHU system for Dravidian-CodeMix - FIRE 2020: Sentiment Analysis for Dravidian Languages in Code-Mixed Text, covering the Tamil-English (TA-EN) and Malayalam-English (ML-EN) language pairs. We submitted three models for sentiment analysis of the code-mixed TA-EN and ML-EN datasets. Run-1 used BERT with a logistic regression classifier, Run-2 used DistilBERT with a logistic regression classifier, and Run-3 used the fastText model. Run-3 outperformed Run-1 and Run-2 on both datasets. We obtained an F1-score of 0.58 (rank 8 of 14) for the TA-EN language pair and an F1-score of 0.63 (rank 11 of 15) for the ML-EN language pair. © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
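The reported results are F1-scores over a multi-class sentiment label set. As a reading aid, the following is a minimal sketch of how such a score can be computed from gold and predicted labels. The abstract does not say which averaging scheme was used, so support-weighted averaging is an assumption here. The function names `per_class_f1` and `weighted_f1` and the example labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted this label, but it was wrong
            fn[gold] += 1  # missed the gold label
    support = Counter(y_true)
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1 = 2 * tp[label] / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 values, weighting each class by its gold support.

    Assumption: weighted averaging; the abstract does not name the scheme.
    """
    if not y_true:
        return 0.0
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 * n for f1, n in scores.values()) / len(y_true)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    gold = ["positive", "positive", "negative", "negative"]
    pred = ["positive", "negative", "negative", "negative"]
    print(round(weighted_f1(gold, pred), 4))
```

Weighting by support keeps frequent classes from being drowned out by rare ones, which matters when the sentiment classes are imbalanced. An unweighted (macro) average would instead treat every class equally.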