A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

Cited by: 5

Authors:

Wang, Yanbin [1]

Ma, Wenrui [1]

Xu, Haitao [1]

Liu, Yiwei [2]

Yin, Peng [2,3]

Affiliations:

[1] Zhejiang Univ, Sch Cyber & Technol, Hangzhou 310027, Peoples R China

[2] Def Ind Secrecy Examinat & Certificat Ctr, Beijing 100089, Peoples R China

[3] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing 101408, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2023 / Volume 13 / Issue 13

Funding:

National Natural Science Foundation of China;

Keywords:

phishing attack detection; multi-view learning; transformer; self-supervised learning; MALICIOUS URL; MODEL;

DOI:

10.3390/app13137429

Chinese Library Classification number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not always effective due to the following reasons: (1) highly concealed phishing websites may employ tactics such as masquerading URL addresses to deceive machine learning models, and (2) phishing attackers frequently change their phishing website URLs to evade detection. In this study, we propose a robust, multi-view Transformer model with an expert-mixture mechanism for accurate phishing website detection utilizing website URLs, attributes, content, and behavioral information. Specifically, we first adapted a pretrained language model for URL representation learning by applying adversarial post-training learning in order to extract semantic information from URLs. Next, we captured the attribute, content, and behavioral features of the websites and encoded them as vectors, which, alongside the URL embeddings, constitute the website's multi-view information. Subsequently, we introduced a mixture-of-experts mechanism into the Transformer network to learn knowledge from different views and adaptively fuse information from various views. The proposed method outperforms state-of-the-art approaches in evaluations of real phishing websites, demonstrating greater performance with less label dependency. Furthermore, we show the superior robustness and enhanced adaptability of the proposed method to unseen samples and data drift in more challenging experimental settings.
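As a rough illustration of the mixture-of-experts fusion idea mentioned in the abstract, the Python sketch below passes several view vectors through a set of experts and combines the expert outputs with softmax gate weights. The abstract does not specify the gating function, the expert architecture, or any dimensions, so everything here is an assumption made for illustration: the names `softmax`, `linear`, `moe_fuse`, and `random_matrix` are hypothetical, the experts are plain linear maps, the gate is a dense softmax, the views are averaged at the end, and the input numbers are placeholders. It is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def moe_fuse(views, expert_weights, gate_weights):
    """Fuse view vectors with a softmax-gated mixture of linear experts.

    views          : list of equal-length feature vectors, one per view
    expert_weights : one (out_dim x in_dim) matrix per expert (assumed linear)
    gate_weights   : (num_experts x in_dim) matrix that scores each expert
    Returns (fused_vector, gate_probabilities_per_view).
    """
    fused = [0.0] * len(expert_weights[0])
    gates = []
    for vec in views:
        # The gate decides how much each expert contributes for this view.
        probs = softmax(linear(gate_weights, vec))
        gates.append(probs)
        for p, w in zip(probs, expert_weights):
            out = linear(w, vec)
            fused = [f + p * o for f, o in zip(fused, out)]
    # Assumption: average over views so the scale does not grow with their count.
    return [f / len(views) for f in fused], gates


def random_matrix(rows, cols, rng):
    """Hypothetical helper: untrained random weights for the demo."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
IN_DIM, OUT_DIM, NUM_EXPERTS = 4, 3, 2

# Placeholder embeddings standing in for the URL, attribute, content,
# and behavior views; the values are made up.
views = [
    [0.2, -0.1, 0.5, 0.3],
    [0.0, 0.4, -0.2, 0.1],
    [0.7, 0.1, 0.0, -0.3],
    [-0.5, 0.2, 0.3, 0.6],
]
experts = [random_matrix(OUT_DIM, IN_DIM, rng) for _ in range(NUM_EXPERTS)]
gate = random_matrix(NUM_EXPERTS, IN_DIM, rng)

fused, gates = moe_fuse(views, experts, gate)
print("fused vector:", fused)
print("gate weights per view:", gates)
```

In a trained model the gate and expert weights would be learned jointly with the rest of the network; here they are random, so only the shapes and the normalization of the gate weights are meaningful.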

References

Pages: 17

49 entries in total

