HateClassify: A Service Framework for Hate Speech Identification on Social Media

Cited by: 24
Authors
Khan, Muhammad U. S. [1]
Abbas, Assad [2]
Rehman, Attiqa [1]
Nawaz, Raheel [3, 4]
Affiliations
[1] COMSATS Univ Islamabad, Abbottabad 22010, Pakistan
[2] COMSATS Univ Islamabad, Comp Sci, Islamabad 45550, Pakistan
[3] Manchester Metropolitan Univ, Digital Technol Solut, Manchester M15 6BH, Lancs, England
[4] Manchester Metropolitan Univ, Analyt & Digital Educ, Manchester M15 6BH, Lancs, England
Keywords
Social networking (online); Voice activity detection; Internet; Support vector machines; Blogs; Training; Logistics;
DOI
10.1109/MIC.2020.3037034
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Existing machine learning approaches struggle to separate hateful content from content that is merely offensive. One prevalent reason for the low accuracy of current hate detection methods is that they treat hate classification as a multiclass problem. In this article, we frame hate identification on social media as a multilabel problem. To this end, we propose a CNN-based service framework called "HateClassify" for labeling social media content as hate speech, offensive, or nonoffensive. Results demonstrate that the multiclass classification accuracy of the CNN-based approaches, particularly the sequential CNN (SCNN), is competitive with and in some cases higher than that of certain state-of-the-art classifiers. Moreover, in the multilabel classification problem, the SCNN exhibits sufficiently high performance among the CNN-based techniques. The results show that using multilabel classification instead of multiclass classification improves hate speech detection by up to 20%.
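The difference between the multiclass and multilabel framings can be illustrated with a minimal sketch. It is not the HateClassify implementation: the raw scores, the 0.5 threshold, and the function names `predict_multiclass` and `predict_multilabel` are assumptions made for illustration only. The multiclass decision picks exactly one label through a softmax, while the multilabel decision scores each label independently through a sigmoid, so one post can carry more than one label.

```python
import math

# Label set taken from the abstract; everything else here is illustrative.
LABELS = ["hate speech", "offensive", "nonoffensive"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Map one raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def predict_multiclass(scores):
    """Multiclass: labels compete, exactly one is returned."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return [LABELS[best]]


def predict_multilabel(scores, threshold=0.5):
    """Multilabel: each label is decided on its own (threshold is assumed)."""
    return [label for label, s in zip(LABELS, scores) if sigmoid(s) >= threshold]


# Hypothetical raw scores for one post, as a classifier's output layer might emit.
scores = [1.2, 1.5, -2.0]
print(predict_multiclass(scores))  # ['offensive']
print(predict_multilabel(scores))  # ['hate speech', 'offensive']
```

With these hypothetical scores, the multiclass rule reports only "offensive" even though the "hate speech" score is nearly as high, whereas the multilabel rule keeps both labels.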
Pages: 40-49
Page count: 10