Phishing Website Detection With Semantic Features Based on Machine Learning Classifiers: A Comparative Study

Cited by: 90
Authors
Almomani, Ammar [1, 2]
Alauthman, Mohammad [3]
Shatnawi, Mohd Taib [4]
Alweshah, Mohammed [4]
Alrosan, Ayat [5]
Alomoush, Waleed [5]
Gupta, Brij B. [6, 7]
Affiliations
[1] Skyline Univ Coll, Res & Innovat Dept, Sharjah, U Arab Emirates
[2] Al Balqa Appl Univ, Al Huson Univ Coll, IT Dept, Salt, Jordan
[3] Univ Petra, Amman, Jordan
[4] Al Balqa Appl Univ, Salt, Jordan
[5] Skyline Univ Coll, Sch Informat Technol, Sharjah, U Arab Emirates
[6] Natl Inst Technol Kurukshetra, Dept Comp Engn, Kurukshetra, Haryana, India
[7] Asia Univ, Taichung, Taiwan
Keywords
Machine Learning Models; Phishing Website; Semantic Classification; Semantic Features; Botnet Detection; Decision Tree
DOI
10.4018/IJSWIS.297032
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phishing attacks, including web phishing and spear phishing, are among the main cybersecurity threats, and phishing websites remain a persistent problem. One of the main contributions of this study is the extraction of URL and domain identity features, abnormal features, HTML and JavaScript features, and domain features as semantic features for detecting phishing websites, which makes classification based on those features more controllable and more effective. The study applied machine learning algorithms to detect phishing websites and compared their results. The authors used 16 machine learning models with 10 semantic features, extracted from two datasets, that represent the most effective features for detecting phishing webpages. GradientBoostingClassifier and RandomForestClassifier achieved the best accuracy in the comparison (about 97%). In contrast, GaussianNB and the stochastic gradient descent (SGD) classifier had the lowest accuracy among the classifiers compared, at 84% and 81% respectively.
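As a rough illustration of what a URL-based feature can look like, the following minimal Python sketch derives a few indicators from a URL string using only the standard library. The function name `url_features` and the specific indicators are hypothetical, chosen for illustration only; the abstract does not list the ten semantic features the authors actually used, so this should not be read as their feature set or extraction method.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Return a few illustrative URL-based indicators.

    Hypothetical feature set for demonstration; not the paper's features.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # Check whether the host is a raw IP address rather than a domain name.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "host_is_ip": host_is_ip,
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "url_length": len(url),
        # Dots beyond the one separating domain and TLD; 0 for IP hosts.
        "subdomain_count": 0 if host_is_ip else max(host.count(".") - 1, 0),
    }


if __name__ == "__main__":
    print(url_features("https://www.example.com/"))
```

Indicators of this kind would typically be collected into a feature vector per URL and passed to a classifier; which indicators are informative depends on the dataset.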
Pages: 24