A Comparative Study of Transformers on Word Sense Disambiguation
| Field | Value |
| --- | --- |
| dc.contributor.author | Chawla A.; Mulay N.; Bishnoi V.; Dhama G.; Singh A.K. |
| dc.date.accessioned | 2025-05-23T11:27:28Z |
| dc.description.abstract | Recent years of research in Natural Language Processing (NLP) have witnessed dramatic growth in training large models for generating context-aware language representations. Numerous NLP systems have leveraged the power of neural network-based architectures to incorporate sense information into embeddings, resulting in Contextualized Word Embeddings (CWEs). Despite this progress, the NLP community has seen no significant work comparing the contextualization power of such architectures. This paper presents a comparative study and an extensive analysis of nine widely adopted Transformer models: BERT, CTRL, DistilBERT, OpenAI-GPT, OpenAI-GPT2, Transformer-XL, XLNet, ELECTRA, and ALBERT. We evaluate their contextualization power on two lexical sample Word Sense Disambiguation (WSD) tasks, SensEval-2 and SensEval-3, using a simple yet effective approach: k-Nearest Neighbor (kNN) classification on CWEs. Experimental results show that the proposed techniques achieve superior results over the current state of the art on both WSD tasks. © 2021, Springer Nature Switzerland AG. (A minimal sketch of the kNN-on-CWE approach follows this record.) |
| dc.identifier.doi | https://doi.org/10.1007/978-3-030-92307-5_87 |
| dc.identifier.uri | http://172.23.0.11:4000/handle/123456789/11450 |
| dc.relation.ispartofseries | Communications in Computer and Information Science |
| dc.title | A Comparative Study of Transformers on Word Sense Disambiguation |
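
The abstract's kNN-on-CWE approach can be illustrated with a minimal sketch: extract a contextualized embedding for the target word from a Transformer, then classify its sense with a kNN over labeled examples. The model choice (`bert-base-uncased`), the first-subtoken embedding heuristic, the toy sense-annotated sentences, and k = 1 are all illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of kNN word sense disambiguation on contextualized
# word embeddings (CWEs). Model, data, and hyperparameters are
# illustrative assumptions, not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.neighbors import KNeighborsClassifier

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Return the CWE of the target word's first subtoken occurrence."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    # Locate the target's subtoken ids inside the sentence ids
    # (a simplifying assumption; the paper's alignment may differ).
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i]
    raise ValueError(f"target {target!r} not found in sentence")

# Toy sense-annotated examples for the ambiguous word "bank"
# (hypothetical data, standing in for SensEval training instances).
train = [
    ("He deposited cash at the bank.", "bank", "FINANCE"),
    ("The bank approved her loan.", "bank", "FINANCE"),
    ("They picnicked on the river bank.", "bank", "RIVER"),
    ("Reeds grew along the muddy bank.", "bank", "RIVER"),
]
X = torch.stack([target_embedding(s, w) for s, w, _ in train]).numpy()
y = [label for _, _, label in train]

knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
test_vec = target_embedding("She withdrew money from the bank.", "bank").numpy()
print(knn.predict([test_vec])[0])  # likely FINANCE
```

Because the classifier is non-parametric, swapping in any of the nine Transformers studied only changes the `target_embedding` backbone, which is what makes this setup a clean probe of each model's contextualization power.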