In-Depth Analysis of Phishing Email Detection: Evaluating the Performance of Machine Learning and Deep Learning Models Across Multiple Datasets

Authors
Alhuzali, Abeer [1]
Alloqmani, Ahad [1]
Aljabri, Manar [1]
Alharbi, Fatemah [2]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia
[2] Taibah Univ, Coll Comp Sci & Engn, Comp Sci Dept, Yanbu 46522, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 06
Keywords
phishing email detection; cybersecurity; artificial intelligence (AI); deep learning (DL); machine learning (ML); spam filtering; threat detection; transformer models
DOI
10.3390/app15063396
CLC classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Phishing emails remain a primary vector for cyberattacks, necessitating advanced detection mechanisms. Existing studies often focus on limited datasets or a small number of models, lacking a comprehensive evaluation approach. This study develops a novel framework for implementing and testing phishing email detection models to address this gap. A total of fourteen machine learning (ML) and deep learning (DL) models are evaluated across ten datasets, including nine publicly available datasets and a merged dataset created for this study. The evaluation is conducted using multiple performance metrics to ensure a comprehensive comparison. Experimental results demonstrate that DL models consistently outperform their ML counterparts in both accuracy and robustness. Notably, transformer-based models BERT and RoBERTa achieve the highest detection accuracies of 98.99% and 99.08%, respectively, on the balanced merged dataset, outperforming traditional ML approaches by an average margin of 4.7%. These findings highlight the superiority of DL in phishing detection and emphasize the potential of AI-driven solutions in strengthening email security systems. This study provides a benchmark for future research and sets the stage for advancements in cybersecurity innovation.
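The abstract describes scoring many models on many datasets with several performance metrics but does not name those metrics. As a minimal sketch of what such a model-by-dataset evaluation grid could look like, the Python below assumes binary labels (1 = phishing, 0 = legitimate) and the four common metrics accuracy, precision, recall, and F1. The function names `binary_metrics` and `evaluate_grid` and the dictionary layout are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1, treating label 1 (phishing) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def evaluate_grid(
    predictions: Dict[str, Dict[str, List[int]]],
    labels: Dict[str, List[int]],
) -> Dict[str, Dict[str, Dict[str, float]]]:
    """Score every (model, dataset) pair.

    predictions[model][dataset] holds that model's predicted labels;
    labels[dataset] holds the ground-truth labels for the same emails.
    """
    return {
        model: {
            dataset: binary_metrics(labels[dataset], preds)
            for dataset, preds in per_dataset.items()
        }
        for model, per_dataset in predictions.items()
    }
```

Reporting precision and recall alongside accuracy matters most on imbalanced datasets, where a model that labels every email legitimate can still score a high accuracy.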
Pages: 30