Enhancing Phishing Detection Through Ensemble Learning and Cross-Validation

Authors
Jawad, Samer Kadhim [1]
Alnajjar, Satea Hikmat [2]
Affiliations
[1] Al Iraqia Univ, Comp Engn, Baghdad, Iraq
[2] Al Iraqia Univ, Network Engn, Baghdad, Iraq
Source
2024 INTERNATIONAL CONFERENCE ON SMART APPLICATIONS, COMMUNICATIONS AND NETWORKING, SMARTNETS-2024, 2024
Keywords
Phishing; Machine learning; Ensemble learning; Gradient Boosting Classifier; cross-validation;
DOI
10.1109/SMARTNETS61466.2024.10577746
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phishing is among the most worrying issues in a constantly changing world. With the rise in Internet usage, phishing has become a new type of data theft. This type of cybercrime refers to the theft of private information and violation of privacy by exploiting human vulnerabilities and technical subterfuge. Phishing through URLs (Uniform Resource Locators) is one of the most common types, and detecting a malicious URL is a major challenge. This study concentrates on enhancing phishing detection through ensemble learning approaches, notably the Gradient Boosting Classifier, CatBoost, and XGBoost algorithms. Using a comprehensive dataset containing examples of both phishing and legitimate sites, the study includes exploratory data analysis, data pre-processing, and rigorous model evaluation using cross-validation. The research also includes feature importance analysis, using permutation techniques to reveal the critical factors that influence the models' decisions. The results demonstrate the effectiveness of ensemble learning in distinguishing between phishing and legitimate sites, with accuracy reaching 98.14% using the Gradient Boosting Classifier with cross-validation, while providing valuable insights into the key features that lead to accurate predictions. This research advances the field of cybersecurity by offering a comprehensive understanding of ensemble learning techniques and their practical applications in fortifying defenses against phishing attempts.
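The evaluation workflow named in the abstract (a gradient boosting classifier scored with cross-validation, followed by permutation importance) might look roughly like the sketch below. It is not the authors' code: the data is synthetic, generated with scikit-learn's `make_classification` as a stand-in for a phishing/legitimate feature table, and the fold count, number of estimators, and feature count are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

# Assumption: synthetic binary data standing in for URL features
# (label 1 = phishing, 0 = legitimate). The paper's dataset is not used here.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, random_state=0
)

# Assumption: hyperparameters are illustrative, not taken from the paper.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)

# Stratified 5-fold cross-validation: each fold keeps the class balance,
# and the model is refit from scratch on every training split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"mean CV accuracy: {scores.mean():.4f} (std {scores.std():.4f})")

# Permutation importance: shuffle one feature at a time on held-out data
# and measure how much the accuracy drops.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0
)

# Feature indices ordered from most to least important.
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking[:3]:
    print(f"feature {idx}: mean importance {result.importances_mean[idx]:.4f}")
```

Computing the importances on a held-out split rather than on the training data is a design choice in this sketch; it measures each feature's contribution to generalization rather than to fitting.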
Pages: 7