CyberBERT: BERT for cyberbullying identification

Cited by: 0
Authors
Sayanta Paul
Sriparna Saha
Affiliation
[1] Indian Institute of Technology Patna, Department of Computer Science and Engineering
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Cyberbullying; Language model; Deep learning; BERT
DOI
Not available
Abstract
Cyberbullying can be delineated as a purposive and recurrent act, which is aggressive in nature, done via different social media platforms such as Facebook, Twitter, Instagram, and others. A state-of-the-art pre-training language model, BERT (Bidirectional Encoder Representations from Transformers), has achieved remarkable results in many language understanding tasks. In this paper, we present a novel application of BERT for cyberbullying identification. A straightforward classification model using BERT is able to achieve the state-of-the-art results across three real-world corpora: Formspring (∼12k posts), Twitter (∼16k posts), and Wikipedia (∼100k posts). Experimental results demonstrate that our proposed model achieves significant improvements over existing works, in comparison with the slot-gated or attention-based deep neural network models.
Pages: 1897-1904
Page count: 7