A literature survey on multimodal and multilingual automatic hate speech identification

Cited by: 38

Authors:

Chhabra, Anusha [1]

Vishwakarma, Dinesh Kumar [1]

Affiliations:

[1] Delhi Technol Univ, Dept Informat Technol, Biometr Res Lab, Delhi 110042, India

Source:

MULTIMEDIA SYSTEMS | 2023 / Vol. 29 / Issue 03

Keywords:

Hate speech; Multilingual; Multimodal; Machine learning; Deep learning; Online social media; ONLINE COMMUNICATION; SOCIAL MEDIA; TWITTER; DATASET;

DOI:

10.1007/s00530-023-01051-8

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Social media is a common and powerful platform for communication, where users share views on any topic or article; this consequently leads to unstructured, toxic, and hateful conversations. Curbing hate speech has emerged as a critical global challenge. In this regard, social media platforms are using modern statistical AI tools to process and eliminate toxic data and so minimize hate crimes. Given this pressing need, machine learning and deep learning-based techniques are receiving increasing attention for analyzing such data. This survey presents a comprehensive analysis of hate speech definitions, along with the motivation for detection and the standard textual analysis methods that play a crucial role in identifying hate speech. State-of-the-art hate speech identification methods are also discussed, highlighting handcrafted feature-based and deep learning-based algorithms that consider multimodal and multilingual inputs, and stating the pros and cons of each. The survey also presents popular benchmark datasets for hate speech and offensive language detection, specifying their challenges, the methods that achieve top classification scores, and dataset characteristics such as the number of samples, modalities, language(s), and number of classes. Additionally, performance metrics are described, and the classification scores of popular hate speech methods are reported. The conclusion and future research directions are presented at the end of the survey. Compared with earlier surveys, this paper gives a better presentation of multimodal and multilingual hate speech detection through well-organized comparisons, challenges, and the latest evaluation techniques, along with their best performances.


Pages: 1203-1230

Number of pages: 28

References: 170 in total
