PITS@Dravidian-CodeMix-FIRE2020: Traditional approach to noisy code-mixed sentiment analysis

Kanwar N.; Agarwal M.; Mundotiya R.K.

Abstract

Sentiment Analysis (SA) is the process of characterizing the response or opinion expressed in a text in order to determine its sentiment polarity. Social media is now a common platform for conveying opinions, suggestions, and much more, either in a user’s native language or in a multilingual mix written in Roman script (for ease). In this task, Malayalam-English and Tamil-English code-mixed datasets in Roman script were provided for SA. To solve this task, we generated syntax-based features and trained a logistic regression classifier with an under-sampling technique. We obtained best F1-scores of 0.71 and 0.62 on the blind test sets of the Malayalam-English and Tamil-English code-mixed datasets, respectively. The code is available on GitHub. © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
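
The abstract mentions under-sampling but does not say which variant was used. As a rough illustration only, the sketch below assumes plain random under-sampling, in which every class is cut down to the size of the rarest class before a classifier is trained. The function name `random_undersample`, the `seed` parameter, and the placeholder data are hypothetical and are not taken from the authors' code.

```python
import random
from collections import Counter, defaultdict


def random_undersample(texts, labels, seed=0):
    """Randomly drop examples so each class keeps as many as the rarest class.

    Assumption: plain random under-sampling; the abstract does not name
    the specific under-sampling technique the authors used.
    """
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    if not by_label:
        return [], []

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_label.values())

    balanced = []
    for label, items in by_label.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [l for _, l in balanced]


# Placeholder inputs for illustration only (not from the shared-task data).
texts = ["sample_0", "sample_1", "sample_2", "sample_3", "sample_4", "sample_5"]
labels = ["positive", "positive", "positive", "negative", "negative", "neutral"]

balanced_texts, balanced_labels = random_undersample(texts, labels)
print(Counter(balanced_labels))
```

The balanced texts would then be turned into feature vectors and passed to a logistic regression classifier. Under-sampling discards training examples, so it trades some majority-class data for a more even class distribution.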