Machine Learning and feature engineering-based study into sarcasm and irony classification with application to cyberbullying detection

Cited by: 49
Authors
Chia, Zheng Lin [1]
Ptaszynski, Michal [1]
Masui, Fumito [1]
Leliwa, Gniewosz [2]
Wroczynski, Michal [2]
Affiliations
[1] Kitami Inst Technol, Dept Comp Sci, Kitami, Hokkaido, Japan
[2] Samurailabs, Gdansk, Poland
Keywords
Irony detection; Sarcasm detection; Machine Learning
DOI
10.1016/j.ipm.2021.102600
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Irony and sarcasm detection is considered a complex task in Natural Language Processing. This paper sets out to explore sarcasm and irony on Twitter using Machine Learning and Feature Engineering techniques. First, we review and clarify the definitions of irony and sarcasm by discussing various studies that focus on these terms. Next, the first experiment compares various types of classification methods, including some classifiers popular for text classification tasks. In the second experiment, different types of data preprocessing methods are compared and analyzed. Finally, the relationship between irony, sarcasm, and cyberbullying is discussed. The results are of interest because we observed a high similarity among the three.
Pages: 12