The Bag of Communities: Identifying Abusive Behavior Online with Preexisting Internet Data

Cited by: 60
Authors
Chandrasekharan, Eshwar [1]
Samory, Mattia [2]
Srinivasan, Anirudh [1]
Gilbert, Eric [1]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
[2] Univ Padua, I-35122 Padua, Italy
Source
PROCEEDINGS OF THE 2017 ACM SIGCHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'17) | 2017
Funding
U.S. National Science Foundation
Keywords
social computing; online communities; abusive behavior; moderation; machine learning
DOI
10.1145/3025453.3026018
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Since its earliest days, harassment and abuse have plagued the Internet. Recent research has focused on in-domain methods to detect abusive content and faces several challenges, most notably the need to obtain large training corpora. In this paper, we introduce a novel computational approach to address this problem called Bag of Communities (BoC), a technique that leverages large-scale, preexisting data from other Internet communities. We then apply BoC toward identifying abusive behavior within a major Internet community. Specifically, we compute a post's similarity to 9 other communities from 4chan, Reddit, Voat and MetaFilter. We show that a BoC model can be used on communities "off the shelf" with roughly 75% accuracy; no training examples are needed from the target community. A dynamic BoC model achieves 91.18% accuracy after seeing 100,000 human-moderated posts, and uniformly outperforms in-domain methods. Using this conceptual and empirical work, we argue that the BoC approach may allow communities to deal with a range of common problems, like abusive behavior, faster and with fewer engineering resources.
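The core idea in the abstract, scoring a post by its similarity to several existing communities, can be illustrated with a minimal sketch. The abstract does not say how similarity is computed, so the bag-of-words cosine similarity below is an assumption chosen for simplicity, not the authors' method. The function names, community names, and toy posts are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens mapped to their counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def community_similarities(post, community_corpora):
    """Return one similarity score per community for a single post.

    community_corpora maps a community name to a list of its posts.
    Each community is collapsed into one bag-of-words vector (an assumption
    made for illustration only).
    """
    post_vector = bag_of_words(post)
    return {
        name: cosine_similarity(post_vector, bag_of_words(" ".join(posts)))
        for name, posts in community_corpora.items()
    }


# Hypothetical toy corpora standing in for the preexisting community data.
corpora = {
    "community_a": ["you are an idiot", "idiot go away"],
    "community_b": ["thanks for the helpful answer", "great link thanks"],
}

scores = community_similarities("thanks for the great answer", corpora)
print(scores)
```

In a setup like this, the per-community scores would form a feature vector for a downstream classifier that decides whether a post is abusive. That classification step is not shown here, and the abstract does not describe it in detail.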
Pages: 3175-3187
Number of pages: 13