A robust anomaly detection algorithm based on principal component analysis

Cited by: 6
Authors
Huang, Yingkun [1]
Jin, Weidong [1]
Yu, Zhibin [1]
Li, Bing [1]
Affiliations
[1] Southwest Jiao Tong Univ, Coll Elect Engn, Chengdu 610031, Sichuan, Peoples R China
Keywords
Anomaly detection; principal component analysis (PCA); location and scale; median absolute deviation (MAD)
DOI
10.3233/IDA-195054
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Quantifying the degree of abnormality of each instance in a data set, in order to detect outlying instances, is an issue in unsupervised anomaly detection research. In this paper, we propose a robust anomaly detection method based on principal component analysis (PCA). Traditional PCA-based detection algorithms commonly produce a high false alarm rate for outliers. The main reason is that they ignore the differences in location and scale among the components of the outlier score, which causes the cumulative outlier score to deviate from its true value. To address this issue, we use the median and the median absolute deviation (MAD) to rescale each outlier score mapped onto the corresponding principal direction. The true outlier score of each instance is then obtained as the weighted sum of squares of the rescaled scores. This also resolves the issue of assigning a weight to each outlier score. The main advantages of our approach are that it is easy to build with unlabeled data and that its recognition performance is better than that of classical PCA-based methods. We compare our method with five different anomaly detection techniques, including two traditional PCA-based methods, in our experimental analysis. The experimental results show that the proposed method performs well in terms of effectiveness, efficiency, and robustness.
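The scoring idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weighting scheme or the centering step, so the uniform default weights, the mean centering, the MAD consistency constant 1.4826, the small `eps` guard, and the function name `robust_pca_scores` are all assumptions made for illustration.

```python
import numpy as np


def robust_pca_scores(X, weights=None, eps=1e-12):
    """Outlier score per row: weighted sum of squared, MAD-rescaled PCA projections.

    Hypothetical helper; weights default to uniform (an assumption).
    """
    X = np.asarray(X, dtype=float)
    # Classical PCA: center by the column means, then take the right
    # singular vectors as principal directions.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt.T  # projection of every instance onto every direction

    # Robust location and scale of the projections along each direction.
    med = np.median(Z, axis=0)
    mad = np.median(np.abs(Z - med), axis=0)
    # 1.4826 makes the MAD comparable to a standard deviation for
    # Gaussian data (assumed here; the abstract does not state it).
    R = (Z - med) / (1.4826 * mad + eps)

    if weights is None:
        weights = np.ones(Z.shape[1])  # assumption: equal weights
    return (R ** 2) @ np.asarray(weights, dtype=float)


# Usage on synthetic data: 200 Gaussian inliers plus one planted outlier.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 3))
X = np.vstack([inliers, [[8.0, 8.0, 8.0]]])
scores = robust_pca_scores(X)
top = int(np.argmax(scores))  # index of the most anomalous instance
```

Because each projection is rescaled by its own median and MAD before squaring, every principal direction contributes to the sum on a comparable scale, which is the location-and-scale mismatch the abstract identifies in traditional PCA-based scores.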
Pages: 249-263
Page count: 15