Group anomaly detection based on Bayesian framework with genetic algorithm

Cited by: 13
Authors
Song, Wanjuan [1 ,2 ,3 ]
Dong, Wenyong [1 ,4 ]
Kang, Lanlan [5 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430070, Peoples R China
[2] Hubei Univ Educ, Coll Comp, Wuhan 430205, Peoples R China
[3] Hubei Educ Cloud Serv Engn Technol Res Ctr, Wuhan 430205, Peoples R China
[4] Nanyang Inst Technol, Sch Software, Nanyang 473004, Peoples R China
[5] Jiangxi Univ Sci & Technol, Coll Appl Sci, Ganzhou 341000, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Group correlation; Genetic algorithm; Anomaly group detection; Logistic normal distribution; Variational inference; Outlier detection
DOI
10.1016/j.ins.2020.03.110
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Anomaly detection is an important application field of evolutionary algorithms. Unlike traditional anomaly detection, group anomaly detection aims to discover anomalous aggregate behaviors in data points. Over the past decades, a large number of promising methods have been successfully applied to group anomaly detection. However, they inherently neglect the correlations among groups in data points, which limits their abilities. This paper presents a correlated hierarchical generative model that introduces a logistic normal distribution to capture the intricate correlations hidden among groups. With the proposed model, we construct a full variational Bayesian framework that data-adaptively optimizes the model parameters. The model is designed and trained using a Genetic Algorithm (GA), which helps automate the use of the generative model. Further, a new score function is proposed as an anomaly criterion to estimate the final anomalous groups in data points. Several experiments on synthetic data and real astronomical star data from the Sloan Digital Sky Survey demonstrate the effectiveness of the proposed method compared with state-of-the-art methods, in terms of average accuracy (AP) and area under the Receiver Operating Characteristic (ROC) curve (AUC). © 2020 Published by Elsevier Inc.
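The abstract's use of a logistic normal distribution can be illustrated with a minimal sketch. It draws a correlated Gaussian vector and maps it onto the probability simplex with a softmax, which is the standard construction of a logistic normal sample. The function name `sample_logistic_normal`, the three-component setup, and the covariance values are assumptions chosen for illustration; this is not the paper's model, inference procedure, or score function.

```python
import numpy as np

def sample_logistic_normal(mean, cov, n_samples, seed=0):
    """Draw proportion vectors from a logistic normal distribution.

    Hypothetical helper for illustration: sample eta ~ N(mean, cov),
    then apply a softmax so each row is a point on the simplex.
    """
    rng = np.random.default_rng(seed)
    eta = rng.multivariate_normal(mean, cov, size=n_samples)
    # Subtract the row-wise maximum for numerical stability.
    eta = eta - eta.max(axis=1, keepdims=True)
    expd = np.exp(eta)
    return expd / expd.sum(axis=1, keepdims=True)

# Assumed example: components 0 and 1 are positively correlated in the
# Gaussian space; component 2 is independent of both.
mean = np.zeros(3)
cov = np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
theta = sample_logistic_normal(mean, cov, n_samples=5000)
```

Unlike a Dirichlet prior, the covariance matrix here lets pairs of components rise and fall together, which is the property the abstract relies on to model correlations among groups.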
Pages: 138-149
Page count: 12