A graph proximity feature augmentation approach for identifying accounts of terrorists on Twitter

Cited by: 5
Authors
Aleroud, Ahmed [1, 2]
Abu-Alsheeh, Nisreen [1]
Al-Shawakfa, Emad [1]
Affiliations
[1] Yarmouk Univ, Dept Informat Syst, Irbid, Jordan
[2] Univ Maryland, Dept Informat Syst, Baltimore Cty UMBC, Baltimore, MD 21201 USA
Keywords
Feature augmentation; Latent Dirichlet allocation (LDA); Social network analysis; Temporal analysis; Terrorism informatics; Graph neighborhood; Media
DOI
10.1016/j.cose.2020.107056
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the popularity of social networks, terrorist groups such as ISIS have encouraged others to follow their activities, share their ideas, recruit fans, radicalize communities, and raise funds to support future attacks. This has led to the emergence of radicalized online accounts that belong to terrorists or their fans. Existing techniques for counter-terrorism investigations, which aim to suspend such accounts, are based on user reports or on syntactic sentiment analysis techniques, which are not accurate on the short texts shared by terrorists, such as tweets. This work proposes a feature augmentation approach that enriches the content of tweets before they are examined for radicalized online content. The augmented tweets are then used to classify accounts into Pro-ISIS or Anti-ISIS categories. We utilized topic modeling as a baseline method for feature augmentation. We studied the effects of utilizing tweets at different time intervals on the quality of the generated models that classify tweets and the corresponding accounts. We then introduced a novel feature augmentation approach that utilizes Neighborhood Overlap, a graph proximity technique that discovers terms having a strong relationship with the Pro-ISIS category. Terms extracted from tweets are represented as nodes in a graph, which is then partitioned into clusters containing different terms. Terms in strongly connected parts of each cluster are added to the original term vectors of the tweets based on the similarity between those terms and each tweet. We compared our approach with other baseline augmentation techniques such as Term-to-Term correlation, Topic Modeling, and other existing techniques. Experimental results on a dataset containing Pro- and Anti-ISIS tweets show that our approach is promising for automating the identification of terrorist content online.
The results show that using graph proximity measures such as Neighborhood Overlap for term augmentation yields higher Precision, Recall, and F-measure than the typical approaches. In addition, we found that applying time-based analysis with term augmentation to identify radicalized accounts enhanced the Precision of the investigation process. © 2020 Elsevier Ltd. All rights reserved.
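As a rough illustration of the Neighborhood Overlap measure named in the abstract, the sketch below scores a pair of terms by the number of neighbors they share divided by the number of terms adjacent to at least one of them, excluding the pair itself. This is the standard textbook definition, not the authors' implementation. The rule for building edges (two terms are linked when they co-occur in one tweet), the function names `build_term_graph` and `neighborhood_overlap`, and the toy tweets are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_term_graph(tweets):
    """Assumed edge rule: link two terms whenever they co-occur in a tweet."""
    neighbors = defaultdict(set)
    for tweet in tweets:
        terms = set(tweet.lower().split())
        for a, b in combinations(sorted(terms), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def neighborhood_overlap(neighbors, a, b):
    """Shared neighbors of a and b over all their neighbors, excluding a and b."""
    shared = neighbors[a] & neighbors[b]
    either = (neighbors[a] | neighbors[b]) - {a, b}
    return len(shared) / len(either) if either else 0.0


# Hypothetical toy data; placeholder tokens stand in for extracted terms.
tweets = ["alpha beta gamma", "alpha beta delta", "gamma epsilon"]
graph = build_term_graph(tweets)
overlap_ab = neighborhood_overlap(graph, "alpha", "beta")
overlap_ge = neighborhood_overlap(graph, "gamma", "epsilon")
```

In this toy graph, "alpha" and "beta" share every neighbor and score highest, while "gamma" and "epsilon" share none. A term pair with a high score sits in a densely connected region of the graph, which is the kind of region the abstract describes as a source of augmentation terms.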
Pages: 25