A machine learning framework for investigating data breaches based on semantic analysis of adversary's attack patterns in threat intelligence repositories

Cited by: 36
Authors
Noor, Umara [1, 5]
Anwar, Zahid [2, 4]
Malik, Asad Waqar [3]
Khan, Sharifullah [3]
Saleem, Shahzad [3]
Affiliations
[1] NUST, Informat Technol, Islamabad, Pakistan
[2] NUST, Islamabad, Pakistan
[3] NUST, Sch Elect Engn & Comp Sci, Islamabad, Pakistan
[4] Fontbonne Univ, Math & Comp Sci, St Louis, MO 63105 USA
[5] Int Islamic Univ, Fac Basic & Appl Sci, Dept Comp Sci & Software Engn, Islamabad, Pakistan
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2019 / Vol. 95
Keywords
Cyber threat intelligence; Data breach investigation; Tactics Techniques and Procedures; Indicators of compromise; Belief network; Latent Semantic Indexing; CYBER; SECURITY;
DOI
10.1016/j.future.2019.01.022
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
With the ever-increasing cases of cyber data breaches, the manual process of sifting through tons of security logs to investigate cyber-attacks is error-prone and time-consuming. Signature-based deep search solutions only give accurate results if the threat artifacts are precisely provided. With the burgeoning variety of sophisticated cyber threats having common attack patterns and utilizing the same attack tools, a timely investigation is nearly impossible. There is a need to automate the threat analysis process by mapping an adversary's Tactics, Techniques and Procedures (TTPs) to attack goals and detection mechanisms. In this paper, a novel machine learning based framework is proposed that identifies cyber threats based on observed attack patterns. The framework semantically relates threats and TTPs extracted from well-known threat sources with associated detection mechanisms to form a semantic network. This network is then used to determine threat occurrences by forming probabilistic relationships between threats and TTPs. The framework is trained using a TTP taxonomy dataset and the performance is evaluated with threat artifacts reported in threat reports. The framework efficiently identifies attacks with 92% accuracy and low false positives even in the case of lost and spurious TTPs. The average detection time of a data breach incident is 0.15 s for a network trained with 133 TTPs from 45 threat families. © 2019 Elsevier B.V. All rights reserved.
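To make the idea of "probabilistic relationships between threats and TTPs" concrete, the following is a minimal sketch in Python. It is not the paper's belief network or its Latent Semantic Indexing step; it swaps in a simple naive-Bayes-style scoring as a stand-in. The threat family names, TTP labels, and the `p_hit` / `p_noise` parameters are all hypothetical and chosen only for illustration.

```python
from math import exp, log

# Hypothetical toy knowledge base: threat family -> TTPs associated with it.
# These names are illustrative placeholders, not data from the paper.
THREAT_TTPS = {
    "FamilyA": {"spearphishing", "credential_dumping", "dns_tunneling"},
    "FamilyB": {"spearphishing", "ransom_note", "file_encryption"},
    "FamilyC": {"sql_injection", "web_shell", "credential_dumping"},
}


def rank_threats(observed, p_hit=0.9, p_noise=0.05):
    """Rank threat families by posterior probability given observed TTPs.

    p_hit:   assumed chance that a TTP used by the threat is observed
             (1 - p_hit models a lost TTP).
    p_noise: assumed chance that a TTP not used by the threat is observed
             anyway (models a spurious TTP).
    """
    vocabulary = set().union(*THREAT_TTPS.values()) | set(observed)
    log_scores = {}
    for threat, ttps in THREAT_TTPS.items():
        score = log(1.0 / len(THREAT_TTPS))  # uniform prior over families
        for ttp in vocabulary:
            p_seen = p_hit if ttp in ttps else p_noise
            score += log(p_seen if ttp in observed else 1.0 - p_seen)
        log_scores[threat] = score
    # Normalise in log space to avoid underflow, then convert to probabilities.
    top = max(log_scores.values())
    weights = {t: exp(s - top) for t, s in log_scores.items()}
    total = sum(weights.values())
    posterior = {t: w / total for t, w in weights.items()}
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)


# One FamilyB TTP is missing ("ransom_note") and one unrelated TTP
# ("web_shell") is present, mimicking lost and spurious observations.
observed = {"spearphishing", "file_encryption", "web_shell"}
ranking = rank_threats(observed)
for threat, prob in ranking:
    print(f"{threat}: {prob:.3f}")
```

In this toy setting the family sharing the most observed TTPs still ranks first despite one lost and one spurious observation, which is the kind of robustness the abstract reports for the actual framework.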
Pages: 467-487
Number of pages: 21