Hateful Sentiment Detection in Real-Time Tweets: An LSTM-Based Comparative Approach

Cited by: 18
Authors
Roy, Sanjiban Sekhar [1]
Roy, Akash [1]
Samui, Pijush [2]
Gandomi, Mostafa [3]
Gandomi, Amir H. [4, 5]
Affiliations
[1] Vellore Inst Technol, Sch Comp Sci & Engn, Vellore, India
[2] Natl Inst Technol Patna, Patna 800005, India
[3] Univ Tehran, Coll Engn, Tehran 1416634793, Iran
[4] Univ Technol Sydney, Fac Engn & Informat Syst, Ultimo, NSW 2007, Australia
[5] Obuda Univ, Univ Res & Innovat Ctr EKIK, H-1034 Budapest, Hungary
Keywords
Social networking (online); Hate speech; Blogs; Feature extraction; Real-time systems; Support vector machines; Radio frequency; Machine learning (ML); social media; text classification; Twitter hate speech detection; CLASSIFICATION;
DOI
10.1109/TCSS.2023.3260217
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
It is undeniable that social media has improved our lives in many ways, such as allowing interactions with others all over the world and network expansion for businesses. However, such accessibility also has detrimental effects, including the rapid spread of hate through offensive messages typically directed toward gender, religion, race, and disability, which can cause psychological harm. To address this problem, many researchers have recently proposed various algorithms powered by machine learning (ML) and deep learning for the detection of hate speech. This work proposes a hate speech detection model based on long short-term memory (LSTM), using term frequency-inverse document frequency (TF-IDF) vectorization, and compares it with support vector machine (SVM), Naive Bayes (NB), logistic regression (LR), XGBoost (XGB), random forest (RF), k-nearest neighbor (k-NN), artificial neural network (ANN), and bidirectional encoder representations from transformers (BERT) models. To validate the proposed work, we obtained a real-time Twitter data stream of a trending topic using the Twitter API and classified it into two classes: hate speech and nonhate speech. The precision, recall, and F1 score achieved by LSTM are 0.98, 0.99, and 0.98, respectively. The accuracy of LSTM for detecting hateful sentiment was found to be 97%, surpassing the accuracy of the other models.
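As a reading aid for the reported metrics, the following minimal sketch shows how precision, recall, and F1 relate for a binary hate/nonhate classifier. It is not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration. It also checks that the reported precision (0.98) and recall (0.99) are consistent with the reported F1 score (0.98) after rounding to two decimals.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical toy labels: 1 = hate speech, 0 = nonhate speech.
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0]
print(precision_recall_f1(toy_true, toy_pred))

# Consistency check on the values reported in the abstract.
print(f"{f1_from(0.98, 0.99):.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is why a precision of 0.98 and a recall of 0.99 yield an F1 score that rounds to 0.98.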
Pages: 5028-5037
Page count: 10