Enhancing Phishing Detection Through Ensemble Learning and Cross-Validation

Cited by: 0
Authors
Jawad, Samer Kadhim [1]
Alnajjar, Satea Hikmat [2]
Affiliations
[1] Al Iraqia Univ, Comp Engn, Baghdad, Iraq
[2] Al Iraqia Univ, Network Engn, Baghdad, Iraq
Source
2024 INTERNATIONAL CONFERENCE ON SMART APPLICATIONS, COMMUNICATIONS AND NETWORKING, SMARTNETS-2024 | 2024
Keywords
Phishing; Machine learning; Ensemble learning; Gradient Boosting Classifier; Cross-validation
DOI
10.1109/SMARTNETS61466.2024.10577746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phishing is among the most worrying issues in a constantly changing world. With the rise in Internet usage, phishing has become a new type of data theft. This type of cybercrime involves the theft of private information and the violation of privacy by exploiting human vulnerabilities and technical subterfuge. URL (Uniform Resource Locator) phishing is one of the most common types, and detecting a malicious URL is a major challenge. This study concentrates on enhancing phishing detection through ensemble learning approaches, notably the Gradient Boosting Classifier, CatBoost, and XGBoost algorithms. Using a comprehensive dataset containing examples of both phishing and legitimate sites, the study includes exploratory data analysis, data pre-processing, and model evaluation using cross-validation. The research also includes feature importance analysis, using permutation techniques to reveal the critical factors that influence the models' decisions. The results demonstrate the effectiveness of ensemble learning in distinguishing between phishing and legitimate sites, with accuracy reaching 98.14% using the Gradient Boosting Classifier with cross-validation, while providing valuable insights into the key features that lead to accurate predictions. This research advances the field of cybersecurity by offering a thorough understanding of ensemble learning techniques and their practical applications in fortifying defenses against phishing attempts.
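The workflow named in the abstract (a Gradient Boosting Classifier evaluated with cross-validation, followed by permutation-based feature importance) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' pipeline: the synthetic data from `make_classification`, the feature count, the fold count, and all hyperparameters are assumptions chosen only to keep the example self-contained, so its accuracy has no relation to the 98.14% reported above.

```python
# Illustrative sketch only: synthetic data stands in for a phishing dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

# Hypothetical stand-in for URL-derived features
# (assumed labels: 1 = phishing, 0 = legitimate).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, random_state=0
)

# Assumed hyperparameters; the abstract does not state the ones used.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)

# Stratified 5-fold cross-validation (fold count is an assumption).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()

# Permutation importance: shuffle one feature at a time on held-out data
# and measure how much the score drops.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0
)

# Feature indices ordered from most to least important.
ranking = result.importances_mean.argsort()[::-1]

print(f"Mean CV accuracy: {mean_accuracy:.4f}")
print("Features ranked by permutation importance:", ranking.tolist())
```

Computing permutation importance on a held-out split rather than on the training data is a common design choice, since it reflects which features the model relies on for unseen examples. CatBoost and XGBoost expose scikit-learn-compatible classifiers, so they could presumably be swapped in for `model` in a similar way.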
Pages: 7