Malicious URL Detection based on Machine Learning

Cited by: 0
Authors
Cho Do Xuan [1, 2]
Hoa Dinh Nguyen [1, 2]
Nikolaevich, Tisenko Victor [3]
Affiliations
[1] Posts & Telecommun Inst Technol, Informat Secur Dept, Hanoi, Vietnam
[2] FPT Univ, Informat Assurance Dept, Hanoi, Vietnam
[3] Peter Great St Petersburg Polytech Univ, Syst Automat Design, Polytech Skaya 29, St Petersburg, Russia
Keywords
URL; malicious URL detection; feature extraction; feature selection; machine learning;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
The risk of network information insecurity is growing rapidly in both the number of incidents and their severity. The methods hackers use most today attack end-to-end technology and exploit human vulnerabilities. These techniques include social engineering, phishing, and pharming. One step in conducting such attacks is to deceive users with malicious Uniform Resource Locators (URLs). As a result, malicious URL detection is of great interest. Several studies have presented methods for detecting malicious URLs using machine learning and deep learning techniques. In this paper, we propose a malicious URL detection method that applies machine learning to our proposed URL behaviors and attributes. Big data technology is also exploited to improve the detection of malicious URLs based on abnormal behaviors. In short, the proposed detection system consists of a new set of URL features and behaviors, a machine learning algorithm, and a big data technology. The experimental results show that the proposed URL attributes and behaviors significantly improve the ability to detect malicious URLs. This suggests that the proposed system may be considered an optimized and user-friendly solution for malicious URL detection.
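The abstract does not list the proposed URL attributes, so the following is only an illustrative sketch of what lexical feature extraction for a URL classifier might look like. The function name `extract_lexical_features` and every feature in it are assumptions chosen for illustration, not the authors' feature set; the resulting dictionary could be converted to a numeric vector and passed to any standard classifier.

```python
from urllib.parse import urlparse
import ipaddress


def extract_lexical_features(url: str) -> dict:
    """Return a few simple lexical features of a URL.

    Hypothetical feature set for illustration only; the paper's
    actual attributes are not given in the abstract.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a commonly used signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments, e.g. /a/b.php has two.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


if __name__ == "__main__":
    for name, value in extract_lexical_features(
        "http://192.168.0.1/login/verify.php"
    ).items():
        print(f"{name}: {value}")
```

Lexical features like these need no network access, which makes them cheap to compute; behavior-based features of the kind the abstract mentions would require additional data sources not sketched here.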
Pages: 148-153
Page count: 6
Related papers
15 in total
  • [1] [Anonymous], 2012, Revised Selected Papers
  • [2] [Anonymous], 2017, CORR
  • [3] [Anonymous], INT SEC THREAT REP I
  • [4] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [5] Canfora G, 2014, LECT NOTES COMPUT SC, V8708, P226, DOI 10.1007/978-3-319-10975-6_17
  • [6] Cova M., 2010, P 19 INT C WORLD WID, P281, DOI 10.1145/1772690.1772724
  • [7] Ensemble methods in machine learning
    Dietterich, TG
    [J]. MULTIPLE CLASSIFIER SYSTEMS, 2000, 1857 : 1 - 15
  • [8] A Taxonomy of Attacks and a Survey of Defence Mechanisms for Semantic Social Engineering Attacks
    Heartfield, Ryan
    Loukas, George
    [J]. ACM COMPUTING SURVEYS, 2015, 48 (03)
  • [9] Phishing Detection: A Literature Survey
    Khonji, Mahmoud
    Iraqi, Youssef
    Jones, Andrew
    [J]. IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, 2013, 15 (04) : 2091 - 2121
  • [10] Opperman TAK, 2008, 2008 IEEE 19TH INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS, P53, DOI 10.1109/PIMRC.2008.4699403