In-Depth Analysis of Phishing Email Detection: Evaluating the Performance of Machine Learning and Deep Learning Models Across Multiple Datasets

Cited by: 0

Authors:

Alhuzali, Abeer [1]

Alloqmani, Ahad [1]

Aljabri, Manar [1]

Alharbi, Fatemah [2]

Affiliations:

[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia

[2] Taibah Univ, Coll Comp Sci & Engn, Comp Sci Dept, Yanbu 46522, Saudi Arabia

Source:

APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 06

Keywords:

phishing email detection; cybersecurity; artificial intelligence (AI); deep learning (DL); machine learning (ML); spam filtering; threat detection; transformer models

DOI:

10.3390/app15063396

Chinese Library Classification (CLC) number:

O6 [Chemistry]

Discipline classification code:

0703

Abstract:

Phishing emails remain a primary vector for cyberattacks, necessitating advanced detection mechanisms. Existing studies often focus on limited datasets or a small number of models, lacking a comprehensive evaluation approach. This study develops a novel framework for implementing and testing phishing email detection models to address this gap. A total of fourteen machine learning (ML) and deep learning (DL) models are evaluated across ten datasets, including nine publicly available datasets and a merged dataset created for this study. The evaluation is conducted using multiple performance metrics to ensure a comprehensive comparison. Experimental results demonstrate that DL models consistently outperform their ML counterparts in both accuracy and robustness. Notably, transformer-based models BERT and RoBERTa achieve the highest detection accuracies of 98.99% and 99.08%, respectively, on the balanced merged dataset, outperforming traditional ML approaches by an average margin of 4.7%. These findings highlight the superiority of DL in phishing detection and emphasize the potential of AI-driven solutions in strengthening email security systems. This study provides a benchmark for future research and sets the stage for advancements in cybersecurity innovation.

Pages: 30
