Phishing Webpage Detection via Multi-Modal Integration of HTML DOM Graphs and URL Features Based on Graph Convolutional and Transformer Networks

Cited by: 0
Authors
Yoon, Jun-Ho [1]
Buu, Seok-Jun [1]
Kim, Hae-Jung [2]
Affiliations
[1] Gyeongsang Natl Univ, Dept Comp Engn, Jinju Si 52828, South Korea
[2] Kyungil Univ, Dept Comp Engn, Gyongsan 38428, South Korea
Funding
National Research Foundation of Singapore;
Keywords
phishing webpage detection; graph convolutional network; transformer network; multi-modal integration; cyberspace security; MODEL;
DOI
10.3390/electronics13163344
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Detecting phishing webpages is a critical task in the field of cybersecurity, with significant implications for online safety and data protection. Traditional methods have primarily relied on analyzing URL features, which can be limited in capturing the full context of phishing attacks. In this study, we propose an innovative approach that integrates HTML DOM graph modeling with URL feature analysis using advanced deep learning techniques. The proposed method leverages Graph Convolutional Networks (GCNs) to model the structure of HTML DOM graphs, combined with Convolutional Neural Networks (CNNs) and Transformer Networks to capture the character and word sequence features of URLs, respectively. These multi-modal features are then integrated using a Transformer network, which is adept at selectively capturing the interdependencies and complementary relationships between different feature sets. We evaluated our approach on a real-world dataset comprising URL and HTML DOM graph data collected from 2012 to 2024. This dataset includes over 80 million nodes and edges, providing a robust foundation for testing. Our method demonstrated a significant improvement in performance, achieving a 7.03 percentage point increase in classification accuracy compared to state-of-the-art techniques. Additionally, we conducted ablation tests to further validate the effectiveness of individual features in our model. The results validate the efficacy of integrating HTML DOM structure and URL features using deep learning. Our framework significantly enhances phishing detection capabilities, providing a more accurate and comprehensive solution to identifying malicious webpages.
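The two input modalities named in the abstract can be illustrated with a small standard-library sketch. This is not the authors' implementation: the names `DomGraphBuilder`, `gcn_layer`, and `url_views` are hypothetical, the one-hot tag features and the identity weight matrix are placeholder assumptions, and the CNN, the Transformer encoders, and the fusion step are omitted. It only shows how an HTML DOM can be turned into a graph, how one graph-convolution step of the common form ReLU(D^-1/2 (A + I) D^-1/2 H W) propagates node features over it, and how a URL splits into character and word sequences.

```python
import math
import re
from html.parser import HTMLParser

# Elements that never have children or a closing tag (assumed subset).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DomGraphBuilder(HTMLParser):
    """Collects element tags as nodes and parent-child links as edges."""

    def __init__(self):
        super().__init__()
        self.tags = []    # node index -> tag name
        self.edges = []   # (parent index, child index)
        self._stack = []  # indices of currently open elements

    def handle_starttag(self, tag, attrs):
        node = len(self.tags)
        self.tags.append(tag)
        if self._stack:
            self.edges.append((self._stack[-1], node))
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.tags[self._stack[i]] == tag:
                del self._stack[i:]
                break


def gcn_layer(num_nodes, edges, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0                  # self-loops
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0      # treat DOM edges as undirected
    deg = [sum(row) for row in adj]
    out = []
    for i in range(num_nodes):
        agg = [0.0] * len(features[0])
        for j in range(num_nodes):
            if adj[i][j]:
                coeff = 1.0 / math.sqrt(deg[i] * deg[j])
                for k, x in enumerate(features[j]):
                    agg[k] += coeff * x
        out.append([
            max(0.0, sum(agg[k] * weight[k][m] for k in range(len(agg))))
            for m in range(len(weight[0]))
        ])
    return out


def url_views(url):
    """Returns the character sequence and the word sequence of a URL."""
    lowered = url.lower()
    words = [w for w in re.split(r"[^a-z0-9]+", lowered) if w]
    return list(lowered), words


if __name__ == "__main__":
    builder = DomGraphBuilder()
    builder.feed("<html><body><form><input><a href='#'>x</a></form></body></html>")
    vocab = sorted(set(builder.tags))
    one_hot = [[1.0 if t == v else 0.0 for v in vocab] for t in builder.tags]
    identity = [[1.0 if i == j else 0.0 for j in range(len(vocab))]
                for i in range(len(vocab))]
    hidden = gcn_layer(len(builder.tags), builder.edges, one_hot, identity)
    chars, words = url_views("http://login.example.com/verify?id=1")
    print(builder.tags, builder.edges)
    print(hidden[0])
    print(words)
```

In a full model along the lines the abstract describes, the per-node outputs would be pooled into a graph embedding and combined with learned URL embeddings; the sketch stops before any trainable component.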
Pages: 21