Phishing Web Page Detection Using Optimised Machine Learning

Cited by: 5
Authors
Stobbs, Jordan [1]
Issac, Biju [1]
Jacob, Seibu Mary [2]
Affiliations
[1] Northumbria Univ, Comp & Informat Sci, Newcastle Upon Tyne, Tyne & Wear, England
[2] Teesside Univ, Comp Engn & Digital Technol, Middlesbrough, Cleveland, England
Source
2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020) | 2020
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Phishing detection; Bio-inspired optimisation; Anti-Phishing; Optimisation;
DOI
10.1109/TrustCom50675.2020.00072
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Phishing is a type of social engineering attack that can affect any organisation or individual. This paper explores the effect that different features and optimisation techniques have on the accuracy of intelligent phishing detection using machine learning algorithms. The work covers both hyperparameter optimisation and feature selection. For hyperparameter tuning, both TPE (Tree-structured Parzen Estimator) and GA (Genetic Algorithm) were tested, and the better option depended on the model. For feature selection, GA, MFO (Moth Flame Optimisation) and PSO (Particle Swarm Optimisation) were compared, with PSO working best with a Random Forest model. The features used were related to the URL (Uniform Resource Locator), the DOM (Document Object Model) structure, page rank and page information. The best combination found was Random Forest with PSO for feature selection and TPE for hyperparameter optimisation, giving an accuracy of 99.33%.
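
To make the idea of PSO-based feature selection more concrete, the following is a minimal sketch of a binary PSO in which each particle is a 0/1 mask over the features. It is not the authors' implementation: the abstract does not state which PSO variant was used, so the sigmoid-based binary update, all parameter values, and the names `binary_pso`, `toy_fitness` and `INFORMATIVE` are assumptions made for illustration. In a real phishing detector the fitness function would presumably be a validation score of a classifier such as Random Forest trained on the selected features; here a toy stand-in is used so that the example runs with the standard library alone.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximise fitness(mask) over 0/1 feature masks (illustrative sketch)."""
    rng = random.Random(seed)
    # Random initial masks (positions) and velocities for every particle.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best_i = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best_i][:], pbest_fit[best_i]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull towards personal and global bests.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to avoid saturation
                vel[i][d] = v
                # Sigmoid of the velocity = probability the feature is kept.
                keep_prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < keep_prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for classifier accuracy: reward three "informative"
# feature indices and charge a small penalty per selected feature.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_score = binary_pso(toy_fitness, n_features=8)
print(best_mask, best_score)
```

The per-feature penalty in `toy_fitness` mirrors a common design choice in wrapper-style feature selection: smaller feature subsets are preferred when they score about as well as larger ones.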
Pages: 483-490
Page count: 8