Sarcasm Detection in Tamil and Malayalam Dravidian Code-Mixed Text
Abstract
Sarcasm is a form of verbal irony in which a speaker says the opposite of what is meant, typically in a mocking or humorous manner. Sarcastic comments are common on social media, and many of them are code-mixed. To extract reliable insights from such text, a system must detect sarcasm and identify the sentiment behind it. In this paper, we present our submission to the shared task ‘Sarcasm Identification of Dravidian Languages Tamil and Malayalam,’ organized by Dravidian CodeMix 2023 at the Forum for Information Retrieval Evaluation (FIRE) 2023. Our approach uses BERT (Bidirectional Encoder Representations from Transformers) with an additional neural network layer to classify comments into two classes: sarcastic and non-sarcastic. In our experiments, the model achieves an F1 score of 0.72 on both the Tamil-English and the Malayalam-English code-mixed datasets. © 2022 Copyright for this paper by its authors.
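For readers unfamiliar with the metric, the following minimal sketch shows how an F1 score can be computed for a two-class sarcasm task. The abstract does not state which averaging scheme the reported score uses, so macro-averaging is an assumption here; the function names `f1_for_class` and `macro_f1` and the toy labels are hypothetical and are not taken from the shared-task data.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score for one class, treating `positive` as the target label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the F1 score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels for illustration only; not drawn from the actual datasets.
y_true = ["sarcastic", "non-sarcastic", "sarcastic",
          "non-sarcastic", "non-sarcastic", "sarcastic"]
y_pred = ["sarcastic", "non-sarcastic", "non-sarcastic",
          "non-sarcastic", "sarcastic", "sarcastic"]

print(round(macro_f1(y_true, y_pred), 3))
```

If the reported score is instead a weighted average, the per-class scores would be weighted by class frequency rather than averaged equally; the two schemes can differ noticeably when the classes are imbalanced.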