A Classifier Ensemble for Offensive Text Detection

Cited by: 15
Authors
Pelle, Rogers [1]
Alcantara, Cleber [1]
Moreira, Viviane P. [1]
Affiliation
[1] Univ Fed Rio Grande do Sul, Inst Informat, Bento Goncalves 9500, BR-91501970 Porto Alegre, RS, Brazil
Source
WEBMEDIA'18: PROCEEDINGS OF THE 24TH BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB | 2018
Keywords
Text Classification; Hate Speech Detection
DOI
10.1145/3243082.3243111
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Offensive posts are a constant nuisance on many Web platforms. As a consequence, there has been growing interest in devising methods to automatically identify such posts. In this paper, we present Hate2Vec, an approach for detecting offensive comments on the Web. Hate2Vec relies on a classifier ensemble. The base learners are: (i) a lexicon-based classifier that leverages the semantic relatedness of word embeddings; (ii) a logistic regression classifier based on comment embeddings; and (iii) a standard bag-of-words (BOW) classifier based on unigram features. Our experiments with datasets in English and Portuguese achieved high classification quality (F-measure above 0.9) and significantly outperformed a traditional BOW classifier.
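The three-learner ensemble described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the base learners' outputs are combined, so the majority vote below is an assumption, not the authors' method. The word vectors, seed lexicon, weights, thresholds, and all function names are hypothetical toy values chosen only to make the example self-contained; a real system would use trained embeddings and fitted models.

```python
import math

# Hypothetical 2-d word vectors; real systems would load trained embeddings.
TOY_VECTORS = {
    "idiot": (0.9, 0.1),
    "stupid": (0.8, 0.2),
    "moron": (0.85, 0.15),
    "nice": (0.1, 0.9),
    "thanks": (0.05, 0.95),
    "you": (0.5, 0.5),
}
SEED_LEXICON = ("idiot",)  # hypothetical seed list of offensive words
BOW_WEIGHTS = {"stupid": 1.0, "moron": 1.0, "nice": -1.0, "thanks": -1.0}  # hypothetical


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def lexicon_classifier(tokens, threshold=0.95):
    """(i) Flag a comment if any token lies close to a seed word in embedding space."""
    for tok in tokens:
        vec = TOY_VECTORS.get(tok)
        if vec and any(cosine(vec, TOY_VECTORS[s]) >= threshold for s in SEED_LEXICON):
            return 1
    return 0


def embedding_classifier(tokens, weights=(4.0, -4.0), bias=0.0):
    """(ii) Logistic regression over the mean word vector (hand-set, not fitted, weights)."""
    vecs = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vecs:
        return 0
    mean = [sum(dim) / len(vecs) for dim in zip(*vecs)]
    z = sum(w * x for w, x in zip(weights, mean)) + bias
    return int(1.0 / (1.0 + math.exp(-z)) > 0.5)


def bow_classifier(tokens):
    """(iii) Linear unigram bag-of-words scorer (hand-set, not fitted, weights)."""
    return int(sum(BOW_WEIGHTS.get(t, 0.0) for t in tokens) > 0)


BASE_LEARNERS = (lexicon_classifier, embedding_classifier, bow_classifier)


def ensemble_predict(comment):
    """Assumed combination rule: label 1 (offensive) if at least two learners agree."""
    tokens = comment.lower().split()
    votes = [clf(tokens) for clf in BASE_LEARNERS]
    return int(sum(votes) >= 2)
```

In this toy setup, `ensemble_predict("you idiot")` is flagged even though the BOW scorer has no weight for "idiot", because the lexicon and embedding learners outvote it. That is the kind of complementary behavior an ensemble is meant to exploit.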
Pages: 237-243
Page count: 7