Accelerating automatic hate speech detection using parallelized ensemble learning models

Agarwal S.; Sonawane A.; Chowdary C.R.

doi:https://doi.org/10.1016/j.eswa.2023.120564

Abstract

With the increasing number of social media users and growing online engagement, it is essential to study hate speech propagation on social media platforms (SMPs). Automatic hate speech detection on social media is of utmost importance, as hate speech can create discomfort among users and potentially provoke a strong reaction in society. Ensemble learning algorithms are helpful in addressing sentiment-based classification due to their fault tolerance and efficiency. However, a simple, scalable, and robust framework is required to handle large-scale data efficiently and accurately. Therefore, we propose parallelizing the standard ensemble learning algorithms to speed up automatic hate speech detection on SMPs. In this study, we parallelize the bagging, A-stacking, and random sub-space algorithms and test their serial and parallel versions on standard high-dimensional datasets for hate speech detection. The experiments are performed using six datasets that address hate speech propagation during events such as the COVID-19 pandemic, the US presidential election (2020), and the farmers’ protest in India (2021). Our parallel models achieve a significant speedup with high efficiency, indicating that the proposed models are suitable for the considered application. One of the main motivations of this study is also to highlight the importance of generalization by testing the models in a cross-dataset setting. We observe that accuracy is not affected by parallelizing the algorithms, compared with the serial algorithms executing on a single machine. © 2023 Elsevier Ltd
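As an illustration of the general idea of parallelized bagging, the following is a minimal sketch and not the authors' implementation. It assumes a toy one-dimensional dataset (`DATA`), a hypothetical threshold-based base learner (`train_stump`), and a thread pool standing in for the parallel workers; a real CPU-bound workload would more likely use processes or a cluster. The `speedup` and `efficiency` helpers use the conventional definitions (serial time over parallel time, and speedup per worker), which the abstract does not spell out.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy labeled data (feature, label); purely illustrative, not from the paper.
DATA = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]

def train_stump(args):
    """Fit a toy threshold classifier on one bootstrap sample (hypothetical base learner)."""
    data, seed = args
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]  # bootstrap: sample with replacement
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate sample with a single class: fall back to the full data.
        pos = [x for x, y in data if y == 1]
        neg = [x for x, y in data if y == 0]
    # Threshold halfway between the two class means.
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def parallel_bagging(data, n_estimators=8, workers=4):
    """Train the base learners concurrently; each one is independent of the others."""
    tasks = [(data, seed) for seed in range(n_estimators)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_stump, tasks))

def predict(thresholds, x):
    """Majority vote over the ensemble."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

def speedup(t_serial, t_parallel):
    """Conventional speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, workers):
    """Conventional efficiency: speedup divided by the number of workers."""
    return speedup(t_serial, t_parallel) / workers
```

Because each bootstrap model is trained independently, the training step parallelizes without changing the ensemble's vote, which is consistent with the abstract's observation that accuracy is unaffected by parallelization.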