Effective outlier detection method based on sparse representation

Cited by: 0
Authors
Xu X. [1, 2]
Yao M. [2]
Liu H. [1]
Affiliations
[1] College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua, 321004, Zhejiang
[2] College of Information Engineering, Zhejiang University of Technology, Hangzhou
Source
Year 1600 / Volume Huazhong University of Science and Technology / Issue 48
Keywords
High dimensional data; Nearest neighbourhood; Outlier; Sparse representation; Spectral clustering
DOI
10.13245/j.hust.200704
Chinese Library Classification number
Discipline classification code
Abstract
To improve the performance of outlier detection algorithms on high-dimensional data, an efficient outlier detection algorithm based on sparse representation (ODSR) is presented. A neighborhood relationship matrix is obtained by sparse representation, in which each instance is expressed as a sparse linear combination of the other instances, and the outliers are then identified by spectral clustering. Sparse representation selects neighbors automatically, so it captures neighborhood information effectively and avoids the difficulty of choosing the parameter k in traditional nearest-neighbor-based algorithms. Furthermore, combining sparse representation with spectral clustering can improve the accuracy of the results. ODSR was compared with 6 popular outlier detection algorithms on 11 real datasets. The results show that the complexity and AUC of ODSR are lower and its stability is higher. © 2020, Editorial Board of Journal of Huazhong University of Science and Technology. All rights reserved.
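The two stages named in the abstract (sparse self-representation to build a neighborhood matrix, then a spectral step to separate outliers) can be illustrated with a minimal sketch. This is not the authors' ODSR implementation: the ISTA solver, the penalty `lam`, the unnormalised Laplacian, the small 2-means routine, the rule "flag the smaller cluster", and all function names are assumptions made for illustration. The toy data is deliberately simple: five instances share one direction and a sixth is orthogonal to them.

```python
import numpy as np


def sparse_codes(X, lam=0.1, n_iter=200):
    """Approximate min_C 0.5*||X - C X||_F^2 + lam*||C||_1 with diag(C) = 0.

    Row i of C expresses instance i as a sparse combination of the others.
    Plain ISTA (proximal gradient) is used here; the solver is an assumption.
    """
    n = X.shape[0]
    G = X @ X.T                                   # Gram matrix of the instances
    step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)   # 1 / Lipschitz constant
    C = np.zeros((n, n))
    for _ in range(n_iter):
        Z = C - step * (C @ G - G)                # gradient step
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)                  # an instance may not use itself
    return C


def spectral_embedding(W, k=2):
    """Eigenvectors of the unnormalised Laplacian D - W for the k smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)                   # eigenvalues are returned in ascending order
    return vecs[:, :k]


def two_means(E, n_iter=10):
    """Tiny 2-means, initialised with the two embedded points that are farthest apart."""
    d2 = ((E[:, None, :] - E[None, :, :]) ** 2).sum(-1)
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    centers = E[[i, j]].copy()
    for _ in range(n_iter):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dist, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = E[labels == k].mean(axis=0)
    return labels


def detect_outliers(X, lam=0.1):
    """Return indices in the smaller spectral cluster, plus the affinity matrix."""
    C = sparse_codes(X, lam)
    W = np.abs(C) + np.abs(C).T                   # symmetric neighborhood matrix
    labels = two_means(spectral_embedding(W))
    minority = np.argmin(np.bincount(labels, minlength=2))
    return np.flatnonzero(labels == minority), W


# Toy data (hypothetical): five instances along one axis, one orthogonal instance.
X = np.zeros((6, 8))
X[:5, 0] = [1.0, 2.0, 3.0, 4.0, 5.0]
X[5, 1] = 3.0

flagged, W = detect_outliers(X)
print(flagged)  # on this toy input: [5]
```

No neighbor count k appears anywhere in the sketch: the L1 penalty decides which instances enter each representation, which is the property the abstract highlights. In the toy case the orthogonal instance receives zero weight in every row and column of `W`, so the spectral embedding isolates it.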
Pages: 20-25
Page count: 5