A high-accuracy phishing website detection method based on machine learning

Cited by: 9
Authors
Bahaghighat, Mahdi [1]
Ghasemi, Majid [1]
Ozen, Figen [2]
Affiliations
[1] Imam Khomeini Int Univ, Dept Comp Engn, Qazvin, Iran
[2] Halic Univ, Istanbul, Turkiye
Keywords
Phishing website detection; Cyber security; Machine learning; Classification; XGBoost;
DOI
10.1016/j.jisa.2023.103553
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rapid development of e-commerce, e-banking, and social networks has made phishing attack detection one of the most critical technologies in all cyber security systems. To improve the efficiency of anti-phishing techniques, we present an improved predictive model based on machine learning. The proposed method uses six different algorithms: Logistic Regression, K-Nearest Neighbors, Naive Bayes, Random Forest, Support Vector Machine, and Extreme Gradient Boosting (XGBoost). The experiments are based on a public dataset of 58,000 legitimate websites and 30,647 phishing ones, with 112 attributes for each sample. Our evaluations in the feature selection process show that a noticeable improvement can be achieved after balancing the dataset and dropping constant features. We conducted our evaluation on eight major distinct scenarios. The experimental results of our phishing website detection (PWD) method indicate remarkable performance: each algorithm reached an accuracy of more than 93%, and the XGBoost classifier outperformed the others with 99.2% overall accuracy, 99.1% precision, 99.4% recall, and 99.1% specificity. In addition, the study achieved a run-time of about 1500 ms for the XGBoost algorithm without dimension reduction, while using Principal Component Analysis (PCA) reduced it to 869 ms. As a result, the proposed approach would be practical in both offline and real-time applications.
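The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say how the dataset was balanced, so random undersampling of the majority class is assumed here, and the function names `drop_constant_features`, `undersample`, and `binary_metrics` are hypothetical. The metric definitions are the standard confusion-matrix ones.

```python
import random


def drop_constant_features(rows):
    """Remove columns whose value is identical in every row.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    if not rows:
        return [], []
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in keep] for row in rows], keep


def undersample(rows, labels, seed=0):
    """Balance the classes by randomly discarding majority-class samples.

    Assumption: the paper's balancing method is not stated in the abstract;
    random undersampling is used here only as one common choice.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    chosen = []
    for idx in by_class.values():
        chosen.extend(rng.sample(idx, n_min))
    chosen.sort()  # preserve the original sample order
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and specificity from a confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

In a full pipeline, a classifier would be trained on the balanced, reduced feature matrix and `binary_metrics` applied to its predictions on held-out data, with the phishing class treated as positive.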
Pages: 17
Related papers
50 in total
  • [41] Phishing Attacks Detection A Machine Learning-Based Approach
    Salahdine, Fatima
    El Mrabet, Zakaria
    Kaabouch, Naima
    2021 IEEE 12TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON), 2021, : 250 - 255
  • [42] Phishing detection based on machine learning and feature selection methods
    Almseidin M.
    Abu Zuraiq A.M.
    Al-kasassbeh M.
    Alnidami N.
    International Journal of Interactive Mobile Technologies, 2019, 13 (12) : 71 - 183
  • [43] A case study on phishing detection with a machine learning net
    Bezerra, Ana
    Pereira, Ivo
    Rebelo, Miguel Angelo
    Coelho, Duarte
    de Oliveira, Daniel Alves
    Costa, Joaquim F. Pinto
    Cruz, Ricardo P. M.
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024,
  • [44] A Study on Adversarial Sample Resistance and Defense Mechanism for Multimodal Learning-Based Phishing Website Detection
    Duy, Phan The
    Minh, Vo Quang
    Dang, Bui Tan Hai
    Son, Ngo Duc Hoang
    Quyen, Nguyen Huu
    Pham, Van-Hau
    IEEE ACCESS, 2024, 12 : 137805 - 137824
  • [45] Feature Selections for the Machine Learning based Detection of Phishing Websites
    Buber, Ebubekir
    Demir, Onder
    Sahingoz, Ozgur Koray
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017,
  • [46] Highly accurate phishing URL detection based on machine learning
    Jalil S.
    Usman M.
    Fong A.
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (07) : 9233 - 9251
  • [47] WikiPhish: A Diverse Wikipedia-Based Dataset for Phishing Website Detection
    Loiseau, Gabriel
    Lefils, Valentin
    Meyer, Maxime
    Riquet, Damien
    PROCEEDINGS OF THE FOURTEENTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, CODASPY 2024, 2024, : 361 - 366
  • [48] INTELLIGENT TREE-BASED ENSEMBLE APPROACHES FOR PHISHING WEBSITE DETECTION
    Alsariera, Yazan A.
    Balogun, Abdullateef O.
    Adeyemo, Victor E.
    Tarawneh, Omar H.
    Mojeed, Hammed A.
    JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2022, 17 (01) : 563 - 582
  • [49] Novel Optimization-Driven Feature Selection Approach for Enhancing Phishing Website Detection Accuracy
    Saeed, Muslim Mousa
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (06) : 1408 - 1417
  • [50] A High-Accuracy Deep Learning Approach for Wheat Disease Detection
    Patil, Soham Lalit
    SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 1, SMARTCOM 2024, 2024, 945 : 277 - 291