TSD: Detecting Sybil Accounts in Twitter

Cited by: 34
Authors
Alsaleh, Mansour [1]
Alarifi, Abdulrahman [1]
Al-Salman, AbdulMalik [2]
AlFayez, Mohammed [2]
Almuhaysin, Abdulmajeed [2]
Affiliations
[1] King Abdulaziz City Sci & Technol, Comp Res Inst, Riyadh, Saudi Arabia
[2] King Saud Univ, Dept Comp Sci, Riyadh, Saudi Arabia
Source
2014 13TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA) | 2014
Keywords
Social networks; Sybil account; Fake user accounts; Twitter; Web spam; Content spam; Spamdexing;
DOI
10.1109/ICMLA.2014.81
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fake identities and user accounts (also called "Sybils") in online communities are a valuable resource for adversaries, who use them to spread fake product reviews, malware, and spam on social networks and to astroturf political campaigns. State-of-the-art defense mechanisms include Automated Turing Tests (ATTs, such as CAPTCHAs) and graph-based Sybil detectors. Sybil detectors in social networks rely on the assumption that Sybils find it hard to befriend real users, so that Sybils end up connected mainly to each other, forming strongly connected subgraphs that can be detected using graph theory. However, the large majority of Sybils in fact succeed in integrating themselves into real user communities (as is the case on Twitter and Facebook). In this paper, we first study and compare current mechanisms for detecting Sybil accounts. We also explore various types of features for detecting Twitter Sybil accounts, with the objective of building an effective and practical classifier. To build and evaluate our classifier, we collect and manually label a dataset of Twitter accounts, including human users, bots, and hybrids (i.e., accounts whose tweets are posted by both humans and bots). We believe this Twitter Sybils corpus will help researchers conduct sound measurement studies. We also develop a browser plug-in (which we call Twitter Sybils Detector, or TSD for short) that uses our classifier and, when the user clicks on a Twitter account, warns about a possible Sybil account before it is accessed.
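The graph-based assumption described in the abstract can be illustrated with a minimal sketch. This is not the authors' classifier or any specific published detector: it only shows, on a hypothetical toy "follows" graph, how a strongly connected group of accounts can be found with Kosaraju's algorithm. The account names, the edge list, and the size threshold of 3 are all assumptions made for illustration. As the abstract notes, real Sybils often blend into genuine communities, so a structural check like this would miss many of them.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the strongly connected components of a directed graph
    given as (source, target) pairs, using Kosaraju's algorithm."""
    forward, backward, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, finish_order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                finish_order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finish time.
    assigned, components = set(), []
    for start in reversed(finish_order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for prev in backward[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Hypothetical toy data: s1, s2, s3 follow each other in a ring,
# while the remaining accounts form a simple chain.
follows = [
    ("s1", "s2"), ("s2", "s3"), ("s3", "s1"),
    ("s1", "alice"), ("alice", "bob"), ("bob", "carol"),
]
components = strongly_connected_components(follows)

# Assumed threshold: flag mutually reachable groups of three or more accounts.
suspicious = [group for group in components if len(group) >= 3]
print(suspicious)
```

In this toy graph only the ring of `s1`, `s2`, and `s3` is flagged; the chain of ordinary accounts yields single-node components.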
Pages: 463-469
Number of pages: 7