Spam Image Clustering for Identifying Common Sources of Unsolicited Emails

Cited by: 6
Authors
Zhang, Chengcui [1]
Chen, Xin [1]
Chen, Wei-Bang [1]
Yang, Lin [1]
Warner, Gary [1]
Affiliations
[1] Univ Alabama Birmingham, Comp & Informat Sci, Birmingham, AL 35233 USA
Keywords
Botnet; Clustering; Computer Forensics; Cybercrime; Data Mining; Spam Image
DOI
10.4018/jdcf.2009070101
Chinese Library Classification number
TP39 [Computer Applications]
Discipline classification code
081203; 0835
Abstract
In this article, we propose a spam image clustering approach that uses data mining techniques to study the image attachments of spam emails, with the goal of aiding the investigation of spam clusters or phishing groups. Spam images are first modeled based on their visual features; in particular, the foreground text layout, foreground picture illustrations, and background textures are analyzed. After the visual features are extracted, an unsupervised clustering algorithm groups visually similar spam images into clusters. Because there is no prior knowledge of the actual sources of the spam images, the clustering results are evaluated by visual validation. Our initial results show that the proposed approach is effective in identifying visual similarity between spam images and can thus provide important indications of their common source.
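The two-stage pipeline described in the abstract (visual feature extraction followed by unsupervised clustering) can be illustrated with a minimal sketch. The abstract does not specify the actual features or the clustering algorithm, so everything here is an assumption made for illustration: a plain intensity histogram stands in for the text-layout, illustration, and texture features, average-linkage agglomerative clustering with a distance threshold stands in for the unsupervised algorithm, and the names `histogram_feature`, `agglomerative_clusters`, and `make_image` are hypothetical.

```python
import random
from itertools import combinations


def histogram_feature(image, bins=8):
    """Normalized intensity histogram of a grayscale image.

    `image` is a list of rows with pixel values in 0..255. This is an
    assumed stand-in feature, not the feature set used in the article.
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerative_clusters(features, threshold):
    """Average-linkage agglomerative clustering (assumed algorithm).

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        total = sum(l1_distance(features[i], features[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (a, b), dist = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        # a < b always holds here, so deleting index b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def make_image(base, noise, rng, size=16):
    """Synthetic grayscale image: `base` intensity plus uniform noise."""
    return [[rng.randint(base - noise, base + noise) for _ in range(size)]
            for _ in range(size)]


rng = random.Random(0)
# Three dark and three bright synthetic "spam images" (hypothetical data).
images = [make_image(40, 10, rng) for _ in range(3)]
images += [make_image(200, 10, rng) for _ in range(3)]

features = [histogram_feature(img) for img in images]
clusters = sorted(agglomerative_clusters(features, threshold=0.5))
print(clusters)
```

In this toy setup, images generated from the same synthetic template are expected to fall into the same cluster, which mirrors the idea that visually similar spam images may share a source. With real data, the resulting clusters would still need the kind of visual validation the abstract describes, since no ground-truth source labels are available.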
Pages: 1-20
Page count: 20
Related papers
6 items in total
[1]   A Heuristic-Based Feature Selection Method for Clustering Spam Emails [J].
Song, Jungsuk ;
Eto, Masashi ;
Kim, Hyung Chan ;
Inoue, Daisuke ;
Nakao, Koji .
NEURAL INFORMATION PROCESSING: THEORY AND ALGORITHMS, PT I, 2010, 6443 :290-297
[2]   Efficient Clustering of Emails Into Spam and Ham: The Foundational Study of a Comprehensive Unsupervised Framework [J].
Karim, Asif ;
Azam, Sami ;
Shanmugam, Bharanidharan ;
Kannoorpatti, Krishnan .
IEEE ACCESS, 2020, 8 :154759-154788
[3]   Hybrid Convolutional Autoencoder-Hierarchical Clustering Algorithm To Reveal Image Spam Sources [J].
Lu, Yongjin ;
Chen, Wei-Bang ;
Ailsworth, Zanyah ;
Wang, Xiaoliang ;
Zhang, Chengcui ;
Li, Kaixuan .
2023 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI, 2023, :28-33
[4]   An Unsupervised Approach for Content-Based Clustering of Emails Into Spam and Ham Through Multiangular Feature Formulation [J].
Karim, Asif ;
Azam, Sami ;
Shanmugam, Bharanidharan ;
Kannoorpatti, Krishnan .
IEEE ACCESS, 2021, 9 :135186-135209
[5]   Enhancing Multimodal Clustering Framework with Deep Learning to Reveal Image Spam Authorship [J].
Chen, Wei-Bang ;
Lu, Yongjin ;
Ailsworth, Zanyah ;
Wang, Xiaoliang ;
Zhang, Chengcui .
2021 IEEE 22ND INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE (IRI 2021), 2021, :193-200
[6]   Clustering image noise patterns by embedding and visualization for common source camera detection [J].
Georgievska, Sonja ;
Bakhshi, Rena ;
Gavai, Anand ;
Sclocco, Alessio ;
van Werkhoven, Ben .
DIGITAL INVESTIGATION, 2017, 23 :22-30