Automatic PAM clustering algorithm for outlier detection

Cited by: 9
Authors
Lei, Dajiang [1]
Zhu, Qingsheng [1]
Chen, Jun [1]
Lin, Hai [1]
Yang, Peng [1]
Affiliation
[1] College of Computer, Chongqing University, Chongqing
Keywords
Cluster validation; Outlier detection; PAM clustering algorithm; Subtractive clustering
DOI
10.4304/jsw.7.5.1045-1051
Abstract
In this paper, we propose an automatic PAM (Partition Around Medoids) clustering algorithm for outlier detection. The proposed methodology comprises two phases: clustering and outlier scoring. In the clustering phase, the number of clusters is determined automatically by combining the PAM clustering algorithm with a specific cluster validation metric; this is vital for finding a clustering solution that best fits the given data set, especially for the PAM clustering algorithm. In the outlier scoring phase, the outlying score of each data instance is determined according to the cluster structure. Experiments on different datasets show that the proposed algorithm achieves a higher detection rate together with a lower false alarm rate than state-of-the-art outlier detection techniques, and that it can be an effective solution for detecting outliers. © 2012 ACADEMY PUBLISHER.
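The two phases described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the cluster validation metric or the scoring rule, so the mean silhouette width (for choosing the number of clusters) and the distance to the assigned medoid (as the outlying score) are both assumptions. The function names `pam`, `silhouette`, and `auto_pam_outlier_scores` are hypothetical.

```python
import math

def total_cost(D, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))

def pam(D, k):
    """PAM on a precomputed distance matrix D; returns (medoid indices, labels)."""
    n = len(D)
    medoids = []
    # BUILD: greedily add the point that gives the lowest total cost.
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: total_cost(D, medoids + [c])))
    # SWAP: apply the best medoid/non-medoid exchange until none lowers the cost.
    cost = total_cost(D, medoids)
    while True:
        best_cost, best_medoids = cost, None
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                trial_cost = total_cost(D, trial)
                if trial_cost < best_cost - 1e-12:
                    best_cost, best_medoids = trial_cost, trial
        if best_medoids is None:
            break
        medoids, cost = best_medoids, best_cost
    labels = [min(range(k), key=lambda j: D[i][medoids[j]]) for i in range(n)]
    return medoids, labels

def silhouette(D, labels, k):
    """Mean silhouette width (ASSUMED validation metric, not from the abstract)."""
    n = len(D)
    clusters = [[i for i in range(n) if labels[i] == j] for j in range(k)]
    total = 0.0
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(D[i][j] for j in own) / (len(own) - 1)
        b = min((sum(D[i][j] for j in other) / len(other)
                 for c, other in enumerate(clusters)
                 if c != labels[i] and other), default=0.0)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

def auto_pam_outlier_scores(points, k_max=5):
    """Phase 1: pick k by silhouette. Phase 2: score by distance to own medoid."""
    D = [[math.dist(p, q) for q in points] for p in points]
    best = None
    for k in range(2, k_max + 1):
        medoids, labels = pam(D, k)
        s = silhouette(D, labels, k)
        if best is None or s > best[0]:
            best = (s, k, medoids, labels)
    _, k, medoids, labels = best
    # ASSUMED scoring rule: distance from each point to its cluster's medoid.
    scores = [D[i][medoids[labels[i]]] for i in range(len(points))]
    return k, scores

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (3, 3)]  # the last point strays from the first group
k, scores = auto_pam_outlier_scores(points, k_max=4)
print(k, scores.index(max(scores)))
```

On this toy data (two tight squares plus one stray point), the sketch selects two clusters and gives the stray point the largest score. A point far enough away to form its own cluster would become a medoid and score zero under this simple rule, which is one reason a real scoring phase needs to account for cluster structure, as the abstract indicates.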
Pages: 1045-1051
Page count: 6
Related papers
21 items in total
[11] Breunig M.M., Kriegel H.P., Ng R.T., Sander J., LOF: Identifying density-based local outliers, SIGMOD Record, 29, pp. 93-104, (2000)
[12] He Z.Y., Xu X.F., Deng S.C., Discovering cluster-based local outliers, Pattern Recognition Letters, 24, pp. 1641-1650, (2003)
[13] Jiang M.F., Tseng S.S., Su C.M., Two-phase clustering process for outliers detection, Pattern Recognition Letters, 22, pp. 691-700, (2001)
[14] Chiu S.L., Extracting fuzzy rules for pattern classification by cluster estimation, Presented at the 6th Internat. Fuzzy Systems Association World Congress, (1995)
[15] Wang K., Wang B., Peng L., CVAP: Validation for Cluster Analyses, Data Science Journal, 8, pp. 88-93, (2009)
[16] Chen G., Jaradat S.A., Banerjee N., Tanaka T.S., Ko M.S.H., Zhang M.Q., Evaluation and Comparison of Clustering Algorithms in Analyzing ES Cell Gene Expression Data, Statistica Sinica, 12, pp. 241-262, (2002)
[17] Kaufman L., Rousseeuw P., Finding Groups in Data: An Introduction to Cluster Analysis, (1990)
[18] Yang P., Huang B., An Outlier Detection Algorithm Based on Spectral Cluster, Presented at Proceedings of the 2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, (2008)
[19] Cerioli A., Farcomeni A., Error rates for multivariate outlier detection, Computational Statistics and Data Analysis, 55, pp. 544-553, (2011)
[20] Davis J., Goadrich M., The relationship between Precision-Recall and ROC curves, Presented at the Proceedings of the 23rd International Conference on Machine Learning, (2006)