A Classifier Ensemble for Offensive Text Detection

Cited by: 15

Authors:

Pelle, Rogers [1]

Alcantara, Cleber [1]

Moreira, Viviane P. [1]

Affiliations:

[1] Univ Fed Rio Grande do Sul, Inst Informat, Bento Goncalves 9500, BR-91501970 Porto Alegre, RS, Brazil

Source:

WEBMEDIA'18: PROCEEDINGS OF THE 24TH BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB | 2018

Keywords:

Text Classification; Hate Speech Detection

DOI:

10.1145/3243082.3243111

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Offensive posts are a constant nuisance on many Web platforms. As a consequence, there has been growing interest in devising methods to automatically identify such posts. In this paper, we present Hate2Vec, an approach for detecting offensive comments on the Web. Hate2Vec relies on a classifier ensemble. The base learners are: (i) a lexicon-based classifier which leverages the semantic relatedness of word embeddings; (ii) a logistic regression classifier based on comment embeddings; and (iii) a standard bag-of-words (BOW) classifier based on unigram features. Our experiments with datasets in English and Portuguese yielded high classification scores (F-measure above 0.9) and significantly outperformed a traditional BOW classifier.
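The three-learner ensemble described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the base learners are combined, so majority voting is an assumption here. All names, word vectors, weights, and lexicon entries are hypothetical toy values, not taken from the paper, and the lexicon learner below is a plain lookup rather than the embedding-based relatedness expansion the authors describe.

```python
import math
from typing import Callable, Dict, List, Sequence, Set

# A base learner maps a comment to 1 (offensive) or 0 (not offensive).
Classifier = Callable[[str], int]


def tokenize(comment: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return comment.lower().split()


def make_lexicon_classifier(lexicon: Set[str]) -> Classifier:
    """Flag a comment if any token appears in the lexicon (plain lookup)."""
    def classify(comment: str) -> int:
        return int(any(tok in lexicon for tok in tokenize(comment)))
    return classify


def make_embedding_classifier(vectors: Dict[str, Sequence[float]],
                              weights: Sequence[float],
                              bias: float) -> Classifier:
    """Logistic regression over a comment embedding.

    Assumption: the comment embedding is the mean of the known word vectors.
    """
    def classify(comment: str) -> int:
        known = [vectors[t] for t in tokenize(comment) if t in vectors]
        dim = len(weights)
        if known:
            mean = [sum(v[i] for v in known) / len(known) for i in range(dim)]
        else:
            mean = [0.0] * dim  # no known words: fall back to the zero vector
        z = bias + sum(w * x for w, x in zip(weights, mean))
        return int(1.0 / (1.0 + math.exp(-z)) > 0.5)
    return classify


def make_bow_classifier(unigram_weights: Dict[str, float]) -> Classifier:
    """Linear bag-of-words scorer over unigram features."""
    def classify(comment: str) -> int:
        score = sum(unigram_weights.get(t, 0.0) for t in tokenize(comment))
        return int(score > 0)
    return classify


def majority_vote(classifiers: Sequence[Classifier], comment: str) -> int:
    """Return 1 if more than half of the base learners vote offensive."""
    votes = sum(clf(comment) for clf in classifiers)
    return int(2 * votes > len(classifiers))


# Hypothetical toy parameters, for illustration only.
ensemble = [
    make_lexicon_classifier({"idiot", "stupid"}),
    make_embedding_classifier(
        vectors={"idiot": (1.0, 0.0), "stupid": (0.9, 0.1), "nice": (0.0, 1.0)},
        weights=(2.0, -2.0),
        bias=-0.5,
    ),
    make_bow_classifier({"idiot": 1.5, "stupid": 1.0, "nice": -1.0}),
]
```

Calling `majority_vote(ensemble, comment)` returns the label chosen by at least two of the three base learners. With an odd number of binary voters no tie-breaking rule is needed.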

Citation

Pages: 237-243

Number of pages: 7

