Detecting Spammers and Content Promoters in Online Video Social Networks

Cited by: 95
Authors
Benevenuto, Fabricio [1]
Rodrigues, Tiago [1]
Almeida, Virgilio [1]
Almeida, Jussara [1]
Goncalves, Marcos [1]
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
Source
PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2009
Keywords
social networks; social media; video response; video spam; video promotion; spammer; promoter
DOI
10.1145/1571941.1572047
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A number of online video social networks, of which YouTube is the most popular, provide features that allow users to post a video as a response to a discussion topic. These features open opportunities for users to introduce polluted content, or simply pollution, into the system. For instance, spammers may post an unrelated video as a response to a popular one, aiming to increase the likelihood that the response is viewed by a larger number of users. Moreover, opportunistic users (promoters) may try to gain visibility for a specific video by posting a large number of (potentially unrelated) responses to boost the rank of the responded video, making it appear in the top lists maintained by the system. Content pollution may jeopardize users' trust in the system, thus compromising its success in promoting social interactions. In spite of that, the available literature is very limited in providing a deep understanding of this problem. In this paper, we go a step further by addressing the issue of detecting video spammers and promoters. Towards that end, we manually build a test collection of real YouTube users, classifying them as spammers, promoters, or legitimate users. Using our test collection, we provide a characterization of social and content attributes that may help distinguish each user class. We also investigate the feasibility of using a state-of-the-art supervised classification algorithm to detect spammers and promoters, and assess its effectiveness on our test collection. We found that our approach is able to correctly identify the majority of the promoters, misclassifying only a small percentage of legitimate users. In contrast, although we are able to detect a significant fraction of spammers, they proved to be much harder to distinguish from legitimate users.
Pages: 620-627
Number of pages: 8