TubeSpam: Comment Spam Filtering on YouTube

Cited by: 68
Authors
Alberto, Tulio C. [1 ]
Lochter, Johannes V. [1 ]
Almeida, Tiago A. [1 ]
Affiliations
[1] Fed Univ Sao Carlos UFSCar, Dept Comp Sci, BR-18052780 Sao Paulo, Brazil
Source
2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA) | 2015
DOI
10.1109/ICMLA.2015.37
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The profitability promoted by Google in its brand new video distribution platform YouTube has attracted an increasing number of users. However, such success has also attracted malicious users, who aim to self-promote their videos or disseminate viruses and malware. Since YouTube offers limited tools for comment moderation, the spam volume is increasing alarmingly, which has led owners of famous channels to disable the comments section in their videos. Automatic comment spam filtering on YouTube is a challenge even for established classification methods, since the messages are very short and often rife with slang, symbols and abbreviations. In this work, we have evaluated several top-performance classification techniques for this purpose. The statistical analysis of the results indicates that, at a 99.9% confidence level, decision trees, logistic regression, Bernoulli Naive Bayes, random forests, and linear and Gaussian SVMs are statistically equivalent. Based on this, we also offer TubeSpam, an accurate online system to filter comments posted on YouTube.
Pages: 138-143
Page count: 6