Platform-Oblivious Anti-Spam Gateway

Citations: 0
Authors
Zhang, Yihe [1]
Yuan, Xu [1]
Tzeng, Nian-Feng [1]
Affiliations
[1] Univ Louisiana Lafayette, Lafayette, LA 70504 USA
Source
37th Annual Computer Security Applications Conference, ACSAC 2021 | 2021
Keywords
Anti-Spam; Unsupervised; Outlier Detection
DOI
10.1145/3485832.3488024
Chinese Library Classification (CLC) number
TP39 [Computer Applications]
Discipline classification code
081203; 0835
Abstract
This paper addresses a novel anti-spam gateway targeting multiple linguistic-based social platforms to expose the outlier property of their spam messages uniformly for effective detection. Instead of labeling ground truth datasets and extracting key features, which are labor-intensive and time-consuming, we start with coarsely mining seed corpora of spams and hams from the target data (aiming for spam classification), before reconstructing them as the reference. To catch each word's rich information in the semantic and syntactic perspectives, we then leverage the natural language processing (NLP) model to embed each word into the high-dimensional vector space and use a neural network to train a spam word model. After that, each message is encoded by using the predicted spam scores from this model for all included stem words. The encoded messages are processed by the prominent outlier techniques to produce their respective scores, allowing us to rank them for making the outlier visible. Our solution is unsupervised, without relying on specifics of any platform or dataset, to be platform-oblivious. Through extensive experiments, our solution is demonstrated to expose spammers' outlier characteristics effectively, outperform all examined unsupervised methods in almost all metrics, and may even better supervised counterparts.
Pages: 1064-1077
Page count: 14
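
The abstract outlines a pipeline: score each word for "spamminess", encode every message from its word scores, then rank the messages by an outlier score. The following is a minimal sketch of that flow, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the hand-written `WORD_SPAM_SCORE` table stands in for the trained spam word model, the histogram encoding in `encode` is one possible way to turn word scores into a fixed-length vector, and the k-nearest-neighbour distance in `knn_outlier_scores` is a generic outlier technique chosen for brevity. The abstract does not say which encoding or outlier methods the paper uses.

```python
import math

# Hypothetical per-word spam scores in [0, 1]. In the paper these would be
# predicted by a trained spam word model, which is not reproduced here.
WORD_SPAM_SCORE = {
    "free": 0.9, "win": 0.85, "click": 0.8, "prize": 0.9,
    "meeting": 0.1, "lunch": 0.05, "today": 0.2, "thanks": 0.05,
}
DEFAULT_SCORE = 0.5  # assumed score for words missing from the table


def encode(message, bins=4):
    """Encode a message as a normalized histogram of its word spam scores."""
    hist = [0.0] * bins
    words = message.lower().split()
    for word in words:
        score = WORD_SPAM_SCORE.get(word, DEFAULT_SCORE)
        # Clamp so that a score of exactly 1.0 falls into the last bin.
        hist[min(int(score * bins), bins - 1)] += 1.0
    total = len(words) or 1  # avoid division by zero for empty messages
    return [count / total for count in hist]


def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean distance to its k nearest neighbours."""
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(math.dist(v, u) for j, u in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def rank_messages(messages, k=2):
    """Return (message, outlier score) pairs, most outlying first."""
    vectors = [encode(m) for m in messages]
    scores = knn_outlier_scores(vectors, k)
    return sorted(zip(messages, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = [
        "lunch today thanks",
        "meeting today",
        "thanks lunch meeting",
        "meeting lunch today thanks",
        "win free prize click",
    ]
    for message, score in rank_messages(sample):
        print(f"{score:.3f}  {message}")
```

No labels are used at the ranking stage: the message whose score profile sits far from its neighbours rises to the top, which matches the unsupervised, outlier-based framing in the abstract. A real system would replace the lookup table with learned word scores and would likely use a more robust outlier detector.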