Institutional Digital Repository
Shreenivas Deshpande Library, IIT (BHU), Varanasi

A language identification method applied to Twitter data

dc.contributor.author: Singh A.K.; Goyal P.
dc.date.accessioned: 2025-05-24T09:20:44Z
dc.description.abstract: This paper presents the results of experiments on using a simple algorithm, aided by a few heuristics, for language identification on Twitter data. The experiments were part of a shared task focused on this problem. The core algorithm is an n-gram based distance metric algorithm that has previously been shown to work very well on normal text. The distance metric used is symmetric cross entropy.
dc.identifier.doi: DOI not available
dc.identifier.uri: http://172.23.0.11:4000/handle/123456789/14362
dc.relation.ispartofseries: CEUR Workshop Proceedings
dc.title: A language identification method applied to Twitter data
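
The abstract describes the core technique only in outline: character n-gram statistics compared with symmetric cross entropy. The following is a minimal sketch of that general idea, not the authors' implementation. The function names (`ngram_profile`, `symmetric_cross_entropy`, `identify`), the maximum n-gram length `n_max`, and the `floor` probability assigned to unseen n-grams are all assumptions made for illustration, and the heuristics mentioned in the abstract are not reproduced here.

```python
import math
from collections import Counter


def ngram_profile(text, n_max=3):
    """Relative frequencies of character n-grams of length 1..n_max.

    n_max=3 is an illustrative choice, not a value taken from the paper.
    """
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def symmetric_cross_entropy(p, q, floor=1e-6):
    """H(p, q) + H(q, p) over the union of n-grams in both profiles.

    Unseen n-grams receive a small floor probability so the logarithm is
    defined; this smoothing scheme is an assumption of the sketch.
    """
    score = 0.0
    for gram in set(p) | set(q):
        pg = p.get(gram, floor)
        qg = q.get(gram, floor)
        score -= pg * math.log(qg) + qg * math.log(pg)
    return score


def identify(text, profiles):
    """Return the language whose profile is closest to the text (lowest score)."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: symmetric_cross_entropy(doc, profiles[lang]))
```

In use, one profile would be built per language from training text, for example `profiles = {"en": ngram_profile(english_text), "es": ngram_profile(spanish_text)}` with hypothetical training strings, and `identify(tweet, profiles)` would return the key with the smallest distance. Short, noisy tweets yield sparse profiles, which is presumably why the abstract mentions additional heuristics on top of the core algorithm.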
