The Effect of Feature Selection on Phish Website Detection: An Empirical Study on Robust Feature Subset Selection for Effective Classification

Cited by: 0
Authors
Zuhair, Hiba [1 ,2 ]
Selamat, Ali [3 ,4 ]
Salleh, Mazleena [1 ]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp, Dept Comp Sci, Johor Baharu 81310, Johor, Malaysia
[2] Al Nahrain Univ, Baghdad, Iraq
[3] Univ Teknol Malaysia, UTM IRDA Ctr Excellence, Johor Baharu 81310, Johor, Malaysia
[4] Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Johor, Malaysia
Keywords
phish website; phishing detection; feature selection; classification model
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Recently, limited anti-phishing campaigns have given phishers more opportunities to bypass defenses through advanced deceptions. Moreover, the failure to devise classification techniques that effectively identify these deceptions has degraded the detection of phishing websites. Consequently, exploiting features that are as new, few, predictive, and effective as possible has emerged as a key challenge in keeping detection resilient. Some prior works have investigated and applied selected feature selection methods to develop their own classification techniques. However, no study has established which feature selection method best enhances classification performance. Hence, this study empirically examines these methods and their effects on classification performance. Furthermore, it recommends criteria for assessing their outcomes and contributes to the problem at hand. Hybrid features, low- and high-dimensional datasets, different feature selection methods, and classification models were examined. The findings showed notably improved detection precision with low latency, as well as noteworthy gains in robustness and prediction susceptibilities. Although selecting an ideal feature subset was a challenging task, the findings provide the most advantageous feature subset possible for robust selection and effective classification in the phishing detection domain.
Pages: 221-232
Page count: 12