Detecting Spam Tweets in Trending Topics Using Graph-Based Approach

Cited by: 2
Authors
Paudel, Ramesh [1]
Kandel, Prajjwal [1]
Eberle, William [1]
Affiliations
[1] Tennessee Technol Univ, Cookeville, TN 38501 USA
Source
PROCEEDINGS OF THE FUTURE TECHNOLOGIES CONFERENCE (FTC) 2019, VOL 1 | 2020 / Vol. 1069
Keywords
Twitter; Spam detection; Anomaly detection; Graph-based anomaly; Imbalanced data
DOI
10.1007/978-3-030-32520-6_39
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, social media has changed the way people communicate and share information. For example, when an important or noteworthy event occurs, many people "tweet" (on Twitter) or post about it, which causes the event to trend and become more popular. Unfortunately, spammers can exploit trending topics to spread spam more quickly and to a wider audience. Researchers have recently applied various machine learning techniques to accounts and messages to detect spam on Twitter. However, spammers can easily fabricate the features of typical tweets. In this work, we propose a graph-based approach for detecting possible spam that leverages the relationship between the named entities present in the content of a tweet and the document referenced by the URL in that tweet. Our hypothesis is that by combining multiple heterogeneous sources of information into a single graph representation, we can discover unusual patterns in the data that reveal spammer activities: structural features that are difficult for spammers to fabricate. We demonstrate the usefulness of this approach by collecting tweets related to Twitter trending topics, together with the documents referenced by the URLs in those tweets, and running graph-based anomaly detection algorithms on a graph representation of the data in order to effectively detect anomalies in trending tweets.
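The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge labels, the helper names (`tweet_graph`, `structural_signature`, `flag_anomalies`), the toy data, and the "most common structure is normal" rule are all assumptions made here for illustration, and named-entity extraction is assumed to have already been done.

```python
from collections import Counter


def tweet_graph(tweet_id, tweet_entities, doc_entities):
    """Build a small labeled graph as a set of (source, label, target) edges.

    Hypothetical layout: a tweet node links to the document behind its URL,
    and both nodes point at the named entities they mention.
    """
    doc_id = f"doc:{tweet_id}"
    edges = {(tweet_id, "links_to", doc_id)}
    edges |= {(tweet_id, "mentions", e.lower()) for e in tweet_entities}
    edges |= {(doc_id, "mentions", e.lower()) for e in doc_entities}
    return edges


def structural_signature(tweet_id, edges):
    """Reduce a graph to a coarse structure: do tweet and document share an entity?"""
    doc_id = f"doc:{tweet_id}"
    tweet_side = {t for s, label, t in edges if s == tweet_id and label == "mentions"}
    doc_side = {t for s, label, t in edges if s == doc_id and label == "mentions"}
    return "shared_entity" if tweet_side & doc_side else "no_shared_entity"


def flag_anomalies(samples):
    """Treat the most frequent signature as normative and flag everything else."""
    signatures = {
        tid: structural_signature(tid, tweet_graph(tid, t_ents, d_ents))
        for tid, t_ents, d_ents in samples
    }
    normative, _ = Counter(signatures.values()).most_common(1)[0]
    return sorted(tid for tid, sig in signatures.items() if sig != normative)


# Toy, invented data: (tweet id, entities in the tweet, entities in the linked document).
samples = [
    ("t1", ["NASA", "Mars"], ["NASA", "Mars", "Perseverance"]),
    ("t2", ["NASA"], ["NASA", "Houston"]),
    ("t3", ["Mars"], ["Mars"]),
    ("t4", ["NASA", "Mars"], ["Casino", "Bonus"]),
]
print(flag_anomalies(samples))  # ['t4']
```

Here `t4` is flagged because its linked document shares no named entity with the tweet text, which is the kind of structural mismatch the abstract argues is hard for spammers to disguise. A real graph-based anomaly detector would compare much richer substructures than this single yes/no signature.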
Pages: 526-546
Page count: 21