Malicious URL Detection based on Machine Learning

Cited by: 0
Authors
Cho Do Xuan [1, 2]
Hoa Dinh Nguyen [1, 2]
Nikolaevich, Tisenko Victor [3 ]
Affiliations
[1] Posts & Telecommun Inst Technol, Informat Secur Dept, Hanoi, Vietnam
[2] FPT Univ, Informat Assurance Dept, Hanoi, Vietnam
[3] Peter Great St Petersburg Polytech Univ, Syst Automat Design, Polytech Skaya 29, St Petersburg, Russia
Keywords
URL; malicious URL detection; feature extraction; feature selection; machine learning;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Currently, network information security risks are increasing rapidly in both number and severity. The methods most commonly used by hackers today are to attack end-to-end technology and to exploit human vulnerabilities. These techniques include social engineering, phishing, pharming, etc. One of the steps in conducting these attacks is to deceive users with malicious Uniform Resource Locators (URLs). As a result, malicious URL detection is of great interest nowadays. Several scientific studies have presented methods for detecting malicious URLs based on machine learning and deep learning techniques. In this paper, we propose a malicious URL detection method using machine learning techniques based on our proposed URL behaviors and attributes. Moreover, big data technology is also exploited to improve the capability of detecting malicious URLs based on abnormal behaviors. In short, the proposed detection system consists of a new set of URL features and behaviors, a machine learning algorithm, and a big data technology. The experimental results show that the proposed URL attributes and behaviors can significantly improve the ability to detect malicious URLs. This suggests that the proposed system may be considered an optimized and user-friendly solution for malicious URL detection.
Pages: 148-153
Number of pages: 6
Related papers
15 in total
[1]  
[Anonymous], 2010, Proceedings of the 19th International Conference on World Wide Web. WWW'10, DOI [10.1145/1772690.1772720, DOI 10.1145/1772690.1772720]
[2]  
[Anonymous], 2014, THESIS
[3]  
[Anonymous], 2017, CORR
[4]  
[Anonymous], INT SEC THREAT REP I
[5]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[6]  
Canfora G, 2014, LECT NOTES COMPUT SC, V8708, P226, DOI 10.1007/978-3-319-10975-6_17
[7]   Ensemble methods in machine learning [J].
Dietterich, TG .
MULTIPLE CLASSIFIER SYSTEMS, 2000, 1857 :1-15
[8]  
Eshete Birhanu, 2012, Revised Selected Papers, V8, P149
[9]   A Taxonomy of Attacks and a Survey of Defence Mechanisms for Semantic Social Engineering Attacks [J].
Heartfield, Ryan ;
Loukas, George .
ACM COMPUTING SURVEYS, 2015, 48 (03)
[10]   Phishing Detection: A Literature Survey [J].
Khonji, Mahmoud ;
Iraqi, Youssef ;
Jones, Andrew .
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, 2013, 15 (04) :2091-2121