AI@ntiPhish - Machine Learning Mechanisms for Cyber-Phishing Attack

Cited by: 18
Authors
Chen, Yu-Hung [1, 3]
Chen, Jiann-Liang [2 ]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Taipei, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Dept Elect Engn, Taipei, Taiwan
[3] TREND MICRO Inc, Tokyo, Japan
Keywords
anti-phishing; machine learning algorithm; ensemble learning mechanism; cyber attack
DOI
10.1587/transinf.2018NTI0001
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study proposes a novel machine learning architecture and various learning algorithms for building anti-phishing services that help users avoid cyber-phishing attacks. With the rapid development of information technology, hackers use cyber-phishing attacks to steal important personal information, which raises information security concerns. Preventing phishing websites involves several aspects, for example, user training, public awareness, and the prevention of fraudulent phishing. However, recent phishing research has mainly focused on preventing fraudulent phishing and has relied on manual identification, which is inefficient for real-time detection systems. In this study, we used ANOVA, the chi-squared (χ²) test, and information gain to evaluate features. We then filtered out the unrelated features and kept the 28 most relevant features for training and evaluating traditional machine learning algorithms: Support Vector Machine (SVM) with linear or RBF kernels, Logistic Regression (LR), Decision Tree, and K-Nearest Neighbor (KNN). We also evaluated these algorithms under the ensemble learning concept by combining multiple classifiers with AdaBoost, bagging, and voting. Finally, the eXtreme Gradient Boosting (XGBoost) model exhibited the best performance, 99.2%, among the algorithms considered in this study.
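The abstract describes two steps that a small example can make concrete: ranking features by a statistical score and keeping the top k, then combining several classifiers by voting. The sketch below is illustrative only and is not the authors' implementation. It assumes categorical (here binary) features, uses the Pearson chi-squared statistic on a contingency table as the score, and runs on an invented toy dataset. The function names `chi2_score`, `select_top_k`, and `majority_vote` are hypothetical, and k is 2 here rather than the 28 used in the paper.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic between one categorical feature and the class label."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    score = 0.0
    for f_value, f_count in feature_counts.items():
        for y_value, y_count in label_counts.items():
            # Expected cell count if the feature and the label were independent.
            expected = f_count * y_count / n
            observed = joint_counts.get((f_value, y_value), 0)
            score += (observed - expected) ** 2 / expected
    return score


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring feature columns, plus all scores."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


def majority_vote(predictions):
    """Hard voting: `predictions` holds one list of predicted labels per classifier."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Invented toy data (assumption): 8 samples, 3 binary features, label 1 = phishing.
X = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

top_features, scores = select_top_k(X, y, k=2)
print("feature scores:", scores)
print("selected feature indices:", top_features)

# Outputs of three hypothetical classifiers on three samples, combined by voting.
print("voted labels:", majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))
```

On this toy data the first feature matches the label exactly and the third is independent of it, so the ranking keeps the first two columns. Hard voting with an odd number of binary classifiers avoids ties; with an even number, a tie-breaking rule would have to be chosen.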
Pages: 878-887
Page count: 10