Phishing website detection: How effective are deep learning-based models and hyperparameter optimization

Cited by: 16
Authors
Almousa, May [1, 2]
Zhang, Tianyang [3]
Sarrafzadeh, Abdolhossein [4]
Anwar, Mohd [1]
Affiliations
[1] North Carolina A&T State Univ, Coll Engn, Comp Sci Dept, Greensboro, NC 27411 USA
[2] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Informat Technol Dept, Riyadh, Saudi Arabia
[3] Univ Massachusetts, Elect & Comp Engn Dept, Coll Engn, Amherst, MA 01003 USA
[4] North Carolina A&T State Univ, Ctr Excellence Cybersecur Res Educ & Outreach CRE, Greensboro, NC USA
Keywords
convolutional neural network; deep learning; fully connected deep neural network; genetic algorithms; grid search algorithms; hyperparameter optimization; long short-term memory; phishing website detection; URL; FEATURE-SELECTION
DOI
10.1002/spy2.256
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Phishing websites are fraudulent websites that appear legitimate and trick unsuspecting users into interacting with them in order to steal their valuable information. Because phishing attacks are a leading cause of data breaches, various anti-phishing solutions have been explored for cybersecurity management, including machine learning-based technical approaches. However, there is a gap in understanding how robust deep learning-based models, together with hyperparameter optimization, are for phishing website detection. To address this gap, this study develops parsimonious deep learning models and applies hyperparameter optimization to achieve high accuracy and reproducible results for phishing website detection. This paper demonstrates a systematic process of building detection models based on three deep learning architectures (Long Short-Term Memory-based, Fully Connected Deep Neural Network-based, and Convolutional Neural Network-based detection models), which are built and evaluated using four publicly available phishing website datasets and achieve a best accuracy of 97.37%. We also compare two algorithms for hyperparameter optimization, Grid Search and Genetic Algorithm, which contributed a 0.1%-1% increase in accuracy.
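The abstract compares Grid Search and a Genetic Algorithm for hyperparameter optimization but does not list the search space or the algorithm settings. The sketch below only illustrates how the two strategies differ in general. The search space, the `toy_score` function (a stand-in for validation accuracy), and all parameter names are hypothetical and are not taken from the paper.

```python
import itertools
import random

# Hypothetical search space; the abstract does not list the paper's hyperparameters.
SEARCH_SPACE = {
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "hidden_units": [16, 32, 64, 128],
    "dropout": [0.0, 0.2, 0.5],
}


def toy_score(params):
    """Stand-in for validation accuracy (assumption); best at lr=0.001, 64 units, dropout=0.2."""
    penalty = 0.0 if params["learning_rate"] == 0.001 else 0.02
    penalty += abs(params["hidden_units"] - 64) / 6400
    penalty += abs(params["dropout"] - 0.2) / 10
    return 0.97 - penalty


def grid_search(space, score):
    """Exhaustively score every combination and return the best one."""
    keys = list(space)
    candidates = (
        dict(zip(keys, values)) for values in itertools.product(*space.values())
    )
    return max(candidates, key=score)


def genetic_search(space, score, generations=20, population_size=8,
                   mutation_rate=0.2, seed=0):
    """Evolve a small population using truncation selection, uniform crossover, and mutation."""
    rng = random.Random(seed)
    keys = list(space)
    population = [
        {k: rng.choice(space[k]) for k in keys} for _ in range(population_size)
    ]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), then refill with offspring.
        population.sort(key=score, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = {k: rng.choice([a[k], b[k]]) for k in keys}
            # Mutation: occasionally resample a gene from the search space.
            for k in keys:
                if rng.random() < mutation_rate:
                    child[k] = rng.choice(space[k])
            children.append(child)
        population = parents + children
    return max(population, key=score)


if __name__ == "__main__":
    print("grid search:", grid_search(SEARCH_SPACE, toy_score))
    print("genetic search:", genetic_search(SEARCH_SPACE, toy_score))
```

Grid search evaluates every combination (48 in this toy space), so it is guaranteed to find the best grid point, but its cost grows multiplicatively with each added hyperparameter. The genetic search samples the space and can be cheaper on large spaces, at the price of no guarantee that it reaches the optimum.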
Pages: 19