RVAE-ABFA: Robust Anomaly Detection for High Dimensional Data Using Variational Autoencoder

Cited by: 5
Authors
Gao, Yuda [1]
Shi, Bin [1]
Dong, Bo [2]
Chen, Yan [1]
Mi, Lingyun [1]
Huang, Zhiping [3]
Shi, Yuanyuan [3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, SPKLSTN Lab, Xian, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Distance Educ, Natl Engn Lab Big Data Analyt, Xian, Peoples R China
[3] SERVYOU Grp, Hangzhou, Peoples R China
Source
2020 IEEE 44TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2020) | 2020
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/COMPSAC48688.2020.0-224
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The curse of dimensionality is a fundamental difficulty in anomaly detection for high-dimensional data. Autoencoder-based approaches offer an elegant solution to this problem. However, existing works require a clean training dataset, which is not always guaranteed in real scenarios. In this paper, we propose a novel anomaly detection method named RVAE-ABFA (robust variational autoencoder with attention-based feature adaptation for high-dimensional data anomaly detection), which significantly improves anomaly detection performance when the training data is contaminated. Rather than using reconstruction error alone, we also take into account the low-dimensional embeddings learned by the variational autoencoder. In RVAE-ABFA, these embeddings help detect anomalies in contaminated data thanks to variational inference. We also propose an ABFA (attention-based feature adaptation) mechanism to adjust the weights of the low-dimensional embeddings and the reconstruction error. Furthermore, we adopt the adversarial training criterion to perform variational inference with an adversarial network, named RAAE-ABFA (robust adversarial autoencoder with attention-based feature adaptation for high-dimensional data anomaly detection), which can generate extra samples when training data is insufficient. Experimental results on several benchmark datasets show that the proposed method significantly outperforms state-of-the-art unsupervised anomaly detection methods and is more robust when the training data is contaminated.
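The abstract describes combining the low-dimensional embedding with the reconstruction error and reweighting them through an attention mechanism, but it does not give the exact formulation. The following is only a minimal NumPy sketch of that general idea under stated assumptions: the relative Euclidean reconstruction error, the softmax-based weighting, and the parameter matrix `w_att` are all hypothetical choices made for illustration, not the authors' implementation. The embeddings `z` and reconstructions `x_hat` are assumed to come from an already trained autoencoder.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def reconstruction_error(x, x_hat):
    """Per-sample relative Euclidean reconstruction error, shape (n, 1).

    Assumption: the relative L2 error is one common choice; the abstract
    does not say which error measure the paper uses.
    """
    num = np.linalg.norm(x - x_hat, axis=1, keepdims=True)
    den = np.linalg.norm(x, axis=1, keepdims=True) + 1e-12  # avoid division by zero
    return num / den


def attention_weighted_features(z, rec_err, w_att):
    """Concatenate embedding and reconstruction error, then reweight them.

    z       : (n, d) low-dimensional embeddings (assumed given by an encoder)
    rec_err : (n, 1) reconstruction error per sample
    w_att   : (d + 1, d + 1) hypothetical attention parameters; in a real
              model these would be learned, here they are supplied by the caller
    Returns the reweighted features and the per-sample attention weights.
    """
    feats = np.concatenate([z, rec_err], axis=1)   # (n, d + 1)
    weights = softmax(feats @ w_att, axis=1)       # each row sums to 1
    return feats * weights, weights
```

The reweighted features could then be passed to any downstream anomaly scorer; which scorer the paper uses is not stated in the abstract, so none is shown here.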
Pages: 334-339
Number of pages: 6