Towards safer online communities: Deep learning and explainable AI for hate speech detection and classification

Cited by: 10

Authors:

Kibriya, Hareem [1]

Siddiqa, Ayesha [1]

Khan, Wazir Zada [1]

Khan, Muhammad Khurram [2]

Affiliations:

[1] Univ Wah, Dept Comp Sci, Wah Cantt 47040, Pakistan

[2] King Saud Univ, Ctr Excellence Informat Assurance, Riyadh 11451, Saudi Arabia

Source:

COMPUTERS & ELECTRICAL ENGINEERING | 2024 / Vol. 116

Keywords:

Hate speech detection; Social media; Deep learning; Explainable Artificial Intelligence; Machine learning; Toxic comments; Hate speech

DOI:

10.1016/j.compeleceng.2024.109153

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

The internet and social media facilitate widespread idea sharing but also contribute to cybercrimes and harmful behaviors, notably the dissemination of abusive and hateful speech, which poses a significant threat to societal cohesion. Hence, prompt and accurate detection of such harmful content is crucial. To address this issue, our study introduces a fully automated end-to-end model for hate speech detection and classification using Natural Language Processing and Deep Learning techniques. The proposed architecture, comprising embedding, convolutional, bidirectional Recurrent Neural Network, and bidirectional Long Short-Term Memory layers, achieved the highest accuracy of 98.5%. Additionally, we employ explainable AI techniques, such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), to gain insights into the performance of the proposed framework. This comprehensive approach meets the pressing demand for swift and precise detection and categorization of harmful online content.

Pages: 15
