Spam Image Clustering for Identifying Common Sources of Unsolicited Emails

Cited by: 6
Authors
Zhang, Chengcui [1]
Chen, Xin [1]
Chen, Wei-Bang [1]
Yang, Lin [1]
Warner, Gary [1]
Affiliations
[1] Univ Alabama Birmingham, Comp & Informat Sci, Birmingham, AL 35233 USA
Keywords
Botnet; Clustering; Computer Forensics; Cybercrime; Data Mining; Spam Image
DOI
10.4018/jdcf.2009070101
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this article, we propose a spam image clustering approach that uses data mining techniques to study the image attachments of spam emails, with the goal of aiding investigations of spam clusters or phishing groups. Spam images are first modeled based on their visual features; in particular, the foreground text layout, foreground picture illustrations, and background textures are analyzed. After the visual features are extracted, an unsupervised clustering algorithm groups visually similar spam images into clusters. Because there is no prior knowledge of the actual sources of the spam images, the clustering results are evaluated by visual validation. Our initial results show that the proposed approach is effective in identifying visual similarity between spam images and can thus provide important indications of their common source.
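The two-stage idea in the abstract (extract visual features, then group similar images without labels) can be illustrated with a minimal sketch. The abstract does not name the features or the clustering algorithm, so everything here is an assumption made for illustration: a coarse intensity histogram stands in for the layout, illustration, and texture features, and a distance-threshold single-linkage grouping stands in for the clustering step. The function names, the toy images, and the threshold value are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def histogram_feature(image, bins=4):
    """Normalized intensity histogram of a grayscale image.

    `image` is a list of rows of integers in 0..255. This is an assumed
    stand-in for the visual features mentioned in the abstract.
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[min(pixel * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_threshold(features, threshold):
    """Single-linkage grouping (assumed, not the authors' algorithm).

    Any two items closer than `threshold` end up in the same cluster.
    Returns a sorted list of clusters, each a list of item indices.
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(features)), 2):
        if euclidean(features[i], features[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(features)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 2x2 "images": two dark ones and one bright one.
images = [
    [[10, 20], [30, 40]],
    [[15, 25], [35, 70]],
    [[250, 240], [230, 220]],
]
features = [histogram_feature(img) for img in images]
clusters = cluster_by_threshold(features, threshold=0.5)
print(clusters)  # [[0, 1], [2]]
```

In this toy run the two dark images fall into one cluster and the bright image into another. As the abstract notes, real results would still need visual validation, since no ground truth about the actual sources is available.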
Pages: 1-20
Page count: 20
Related papers
6 items in total
  • [1] A Heuristic-Based Feature Selection Method for Clustering Spam Emails
    Song, Jungsuk
    Eto, Masashi
    Kim, Hyung Chan
    Inoue, Daisuke
    Nakao, Koji
    NEURAL INFORMATION PROCESSING: THEORY AND ALGORITHMS, PT I, 2010, 6443 : 290 - 297
  • [2] Efficient Clustering of Emails Into Spam and Ham: The Foundational Study of a Comprehensive Unsupervised Framework
    Karim, Asif
    Azam, Sami
    Shanmugam, Bharanidharan
    Kannoorpatti, Krishnan
    IEEE ACCESS, 2020, 8 : 154759 - 154788
  • [3] Hybrid Convolutional Autoencoder-Hierarchical Clustering Algorithm To Reveal Image Spam Sources
    Lu, Yongjin
    Chen, Wei-Bang
    Ailsworth, Zanyah
    Wang, Xiaoliang
    Zhang, Chengcui
    Li, Kaixuan
    2023 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI, 2023, : 28 - 33
  • [4] An Unsupervised Approach for Content-Based Clustering of Emails Into Spam and Ham Through Multiangular Feature Formulation
    Karim, Asif
    Azam, Sami
    Shanmugam, Bharanidharan
    Kannoorpatti, Krishnan
    IEEE ACCESS, 2021, 9 : 135186 - 135209
  • [5] Enhancing Multimodal Clustering Framework with Deep Learning to Reveal Image Spam Authorship
    Chen, Wei-Bang
    Lu, Yongjin
    Ailsworth, Zanyah
    Wang, Xiaoliang
    Zhang, Chengcui
    2021 IEEE 22ND INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE (IRI 2021), 2021, : 193 - 200
  • [6] Clustering image noise patterns by embedding and visualization for common source camera detection
    Georgievska, Sonja
    Bakhshi, Rena
    Gavai, Anand
    Sclocco, Alessio
    van Werkhoven, Ben
    DIGITAL INVESTIGATION, 2017, 23 : 22 - 30