Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies

Cited by: 36
Authors
Xu, Xin [1]
Affiliations
[1] Natl Univ Def Technol, Inst Automat, Coll Mech & Automat, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Anomaly detection; Temporal-difference; Markov reward processes; Learning prediction; Computer security; Reinforcement learning; Intrusion detection;
DOI
10.1016/j.asoc.2009.10.003
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Anomaly detection is an important problem that has been widely studied across diverse research areas and application domains. One of its open problems is the modeling and prediction of complex sequential data, which consist of a series of temporally related behavior patterns. In this paper, a novel sequential anomaly detection method based on temporal-difference (TD) learning is proposed, with the detection of multi-stage cyber attacks considered as an application case. A Markov reward process model is presented for the anomaly detection and alarming process of sequential data, and it is verified that, when the reward function is properly defined, the anomaly probabilities of sequential behaviors are equivalent to the value functions of the Markov reward process. Therefore, TD learning algorithms from the reinforcement learning literature can be used to efficiently construct anomaly detection models of complex sequential behaviors by estimating the value functions of the Markov reward process. Compared with other machine learning methods for anomaly detection, the proposed approach has the advantage of a simplified labeling process using delayed evaluative signals, and prediction accuracy can be improved even when labeled training data are limited. Experimental results on host-based intrusion detection using system call data show that the proposed method achieves detection accuracy higher than, or at least comparable to, that of other approaches, including SVMs and HMMs. (C) 2009 Elsevier B.V. All rights reserved.
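The equivalence between anomaly probabilities and value functions described in the abstract can be illustrated with a minimal sketch. It is not the paper's model. It assumes a tabular, undiscounted TD(0) learner, a reward of 0 on every step except the last, and a final reward equal to the sequence label (1 for anomalous, 0 for normal). The function name `td0_anomaly_values` and the toy system-call names are hypothetical.

```python
from collections import defaultdict


def td0_anomaly_values(sequences, labels, alpha=0.1, epochs=200):
    """Estimate per-state anomaly values with tabular, undiscounted TD(0).

    sequences: list of lists of hashable states (assumed: system-call names).
    labels: one delayed label per sequence, 1 = anomalous, 0 = normal.
    """
    values = defaultdict(float)  # unseen states start at 0.0
    for _ in range(epochs):
        for states, label in zip(sequences, labels):
            for i, state in enumerate(states):
                is_last = i == len(states) - 1
                # Assumed reward design: only the final step carries the label.
                reward = float(label) if is_last else 0.0
                next_value = 0.0 if is_last else values[states[i + 1]]
                td_error = reward + next_value - values[state]
                values[state] += alpha * td_error
    return values


# Toy data (hypothetical): one normal and one anomalous trace.
sequences = [["open", "read", "close"], ["open", "exec", "chmod"]]
labels = [0, 1]
values = td0_anomaly_values(sequences, labels)
for state in ["open", "read", "exec"]:
    print(state, round(values[state], 3))
```

Under this reward design, the learned value of a state approximates the probability that a sequence passing through it ends as anomalous, so a state shared by both traces settles between the values of states seen only in normal or only in anomalous traces. Only one label per sequence is needed, which is the delayed evaluative signal the abstract refers to.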
Pages: 859-867
Page count: 9
Related papers
28 in total
[1]  
[Anonymous], 2002, P 9 ACM C COMP COMM
[2]   Technical update: Least-squares temporal difference learning [J].
Boyan, JA .
MACHINE LEARNING, 2002, 49 (2-3) :233-246
[3]  
CANNADY J, 2000, 23 NAT INF SYST SEC
[4]  
Chawla N. V., 2004, ACM SIGKDD Explorations Newsletter, V6, P1
[5]  
Egan J. P., 1975, SIGNAL DETECTION THE
[6]  
Hofmeyr S. A., 1998, Journal of Computer Security, V6, P151
[7]  
JHA S, 2001, P COMP SEC FDN WORKS
[8]  
Joshi M.V., 2002, KDD 02, P297, DOI DOI 10.1145/775047.775092
[9]   Reinforcement learning: A survey [J].
Kaelbling, LP ;
Littman, ML ;
Moore, AW .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1996, 4 :237-285
[10]  
Kang DK, 2005, LECT NOTES COMPUT SC, V3495, P511