Hunting for Insider Threats Using LSTM-Based Anomaly Detection

Cited by: 21
Authors
Villarreal-Vasquez, Miguel [1]
Modelo-Howard, Gaspar [2]
Dube, Simant [3]
Bhargava, Bharat [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Palo Alto Networks, Santa Clara, CA 95054 USA
[3] Broadcom Inc, Mountain View, CA 94043 USA
Keywords
Anomaly detection; Hidden Markov models; Vocabulary; Computational modeling; Training; Testing; Sequences; endpoint detection and response (EDR); high-dimensional data; insider threats; long short-term memory (LSTM); order-aware recognition (OAR) problem; sequence analysis; variable-length system activity event sequences
DOI
10.1109/TDSC.2021.3135639
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Insider threats are among the most difficult security problems to solve, given the privileges and information available to insiders for launching different types of attacks. Current security systems can record and analyze sequences from a deluge of log data, which makes them a potential tool for detecting insider threats. The difficulty is that insiders interleave attack steps with valid actions, reducing the capacity of security systems to detect the attacks programmatically. To address this shortcoming, we introduce LADOHD, an anomaly detection framework based on Long Short-Term Memory (LSTM) models, which learns the expected event patterns in a computer system to identify attack sequences even when attacks span a long time. The applicability of the framework is demonstrated on a dataset of 38.9 million events collected from a commercial network of 30 computers over twenty days, during which a 4-day insider threat attack occurs. Results show that LADOHD outperforms the anomaly detection system used to protect the commercial network, with a True Positive Rate of 97.29% and a False Positive Rate of 0.38%. Experiments also show that LSTMs have higher prediction precision on variable-length sequences than methods such as Hidden Markov Models, a crucial requirement in sequence-analysis-based anomaly detection techniques.
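The two rates reported in the abstract follow the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below is a minimal illustration of how such rates could be computed from per-event labels. The function name `detection_rates` and the toy labels are hypothetical and invented for illustration. They are not taken from the paper or its dataset, and the record does not say how the authors computed their figures.

```python
def detection_rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 = anomalous, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign events flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign events passed
    # Guard against empty classes so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels, invented for illustration only (not from the paper's dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = detection_rates(y_true, y_pred)
print(f"TPR={tpr:.2%} FPR={fpr:.2%}")
```

In a detector of the kind the abstract describes, `y_pred` would come from thresholding the model's output for each event or sequence; that step is omitted here because the record does not specify it.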
Pages: 451-462
Number of pages: 12