Detecting Arabic Spammers and Content Polluters on Twitter

Cited by: 0
Authors
El-Mawass, Nour [1]
Alaboodi, Saad [1]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
Source
2016 SIXTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION PROCESSING AND COMMUNICATIONS (ICDIPC) | 2016
Keywords
Online Social Networks; Social Spam Detection; Machine Learning; Supervised Classification; Twitter; Arabic Spam
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Spam is thriving on Arabic Twitter. With a large online population, mounting political unrest, and an undersized and unspecialized response effort, the current state of Arabic online social networks (OSNs) offers a perfect target for the spam industry, bringing both abuse and manipulation to the scene. The result is a ubiquitous spam presence that redefines the signal-to-noise ratio and makes spam a de facto component of online social platforms. English spam on online social networks has been heavily studied in the literature. To date, however, social spam in other languages has been largely ignored. Our own analysis of Arabic trending hashtags in Saudi Arabia estimates that spam accounts for about three quarters of the total generated content. This alarming rate, backed by independent concurrent estimates, makes the development of adaptive spam detection techniques a very real and pressing need. In this study, we present a first attempt at detecting accounts that promote spam and content pollution on Arabic Twitter. Using a large crawled dataset of more than 23 million Arabic tweets and a manually labeled sample of more than 5000 tweets, we analyze the spam content on Saudi Twitter and assess the performance of previously proposed spam detection features on our recently gathered dataset. We also adapt these features to respond to spammers' evasion techniques, and use the adapted features to build a new, highly accurate, data-driven detection system.
Pages: 53-58
Page count: 6