Enhancing Malicious URL Detection: A Novel Framework Leveraging Priority Coefficient and Feature Evaluation

Cited by: 2
Authors
Rafsanjani, Ahmad Sahban [1]
Binti Kamaruddin, Norshaliza [2]
Behjati, Mehran [1]
Aslam, Saad [1]
Sarfaraz, Aaliya [1]
Amphawan, Angela [1, 3]
Affiliations
[1] Sunway Univ, Sch Engn & Technol, Bandar Sunway 47500, Selangor Darul, Malaysia
[2] Univ Teknol Malaysia, Fac Artificial Intelligence, Kuala Lumpur 54100, Malaysia
[3] Sunway Univ, Sch Engn & Technol, Smart Photon Res Lab, Subang Jaya 47500, Selangor, Malaysia
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Malicious URL detection; phishing; malware; network security; feature extraction; cyber threats; machine learning; NETWORK
DOI
10.1109/ACCESS.2024.3412331
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Malicious Uniform Resource Locators (URLs) pose a significant cybersecurity threat by carrying out attacks such as phishing and malware propagation. Conventional malicious URL detection methods, relying on blacklists and heuristics, often struggle to identify new and obfuscated malicious URLs. To address this challenge, machine learning and deep learning have been leveraged to enhance detection capabilities, albeit relying heavily on large and frequently updated datasets. Furthermore, the efficacy of these methods is intrinsically tied to the quality of the training data, a requirement that becomes increasingly challenging to fulfill in real-world scenarios due to constraints such as data scarcity and the dynamic nature of evolving cyber threats. In this study, we introduce an innovative framework for malicious URL detection based on predefined static feature classification by allocating priority coefficients and feature evaluation methods. Our feature classification encompasses 42 classes, including blacklist, lexical, host-based, and content-based features. To validate our framework, we collected a dataset of 5000 real-world URLs from prominent phishing and malware websites, namely URLhaus and PhishTank. We assessed our framework's performance using three supervised machine learning methods: Support Vector Machine (SVM), Random Forest (RF), and Bayesian Network (BN). The results demonstrate that our framework outperforms these methods, achieving an impressive detection accuracy of 98.95% and a precision value of 98.60%. Furthermore, we conducted a benchmarking analysis against three comprehensive malicious URL detection methods (PDRCNN, the Li method, and URLNet), demonstrating that our proposed framework excels in terms of accuracy and precision. In conclusion, our novel malicious URL detection framework substantially enhances accuracy, significantly bolstering cybersecurity defenses against emerging threats.
Pages: 85001-85026
Page count: 26