Semi-supervised outlier detection based on fuzzy rough C-means clustering

Cited by: 70
Authors
Xue, Zhenxia [1]
Shang, Youlin [1]
Feng, Aifen [1]
Affiliations
[1] Henan Univ Sci & Technol, Sch Math & Stat, Luoyang, Peoples R China
Keywords
Pattern recognition; Outlier detection; Semi-supervised learning; Rough sets; Fuzzy sets; C-means clustering;
DOI
10.1016/j.matcom.2010.02.007
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper presents a fuzzy rough semi-supervised outlier detection (FRSSOD) approach that combines a small number of labeled samples with fuzzy rough C-means clustering. The method introduces an objective function that minimizes the sum of squared errors of the clustering result, the deviation from the known labeled examples, and the number of outliers. Using fuzzy rough C-means clustering, each cluster is represented by a center, a crisp lower approximation, and a fuzzy boundary, and only points located in the boundary are considered further for reassignment as outliers. As a result, the method obtains better clustering results for normal points and better accuracy for outlier detection. Experimental results show that, on average, the proposed method maintains or improves detection precision, reduces the false alarm rate, and reduces the number of candidate outliers to be examined. (C) 2010 IMACS. Published by Elsevier B.V. All rights reserved.
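The three-part objective described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formula, so everything specific here is an assumption made for illustration rather than the paper's definition: the function name `frssod_like_objective`, the weights `gamma_outlier` and `gamma_label`, the fuzzifier `m`, the convention that a point with all-zero memberships counts as an outlier, and the use of a simple disagreement count as the label deviation.

```python
from typing import List, Optional, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def frssod_like_objective(
    points: List[Sequence[float]],
    centers: List[Sequence[float]],
    memberships: List[List[float]],
    labels: List[Optional[int]],
    gamma_outlier: float = 1.0,  # assumed weight on the outlier count
    gamma_label: float = 1.0,    # assumed weight on label disagreement
    m: float = 2.0,              # assumed fuzzifier exponent
) -> float:
    """Illustrative objective: weighted SSE + outlier count + label deviation.

    memberships[i][k] is the membership of point i in cluster k; a row of
    all zeros means point i is treated as an outlier (assumed convention).
    labels[i] is None (unlabeled), -1 (labeled outlier), or a cluster index
    (labeled normal point).
    """
    sse = 0.0
    n_outliers = 0
    label_dev = 0
    for x, u, y in zip(points, memberships, labels):
        is_outlier = sum(u) == 0
        if is_outlier:
            n_outliers += 1
        # Membership-weighted squared error to every cluster center.
        sse += sum((u_k ** m) * sq_dist(x, c) for u_k, c in zip(u, centers))
        if y is None:
            continue
        # Count a deviation when the assignment contradicts the known label.
        if (y == -1) != is_outlier:
            label_dev += 1
    return sse + gamma_outlier * n_outliers + gamma_label * label_dev


points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
centers = [(0.5, 0.0)]
labels = [None, 0, -1]  # second point labeled normal, third labeled outlier

# Treating the far point as an outlier costs far less than forcing it
# into the single cluster.
as_outlier = frssod_like_objective(points, centers, [[1.0], [1.0], [0.0]], labels)
as_member = frssod_like_objective(points, centers, [[1.0], [1.0], [1.0]], labels)
print(as_outlier, as_member)
```

In this toy setting the outlier term acts as a fixed price for excluding a point, so a point is only excluded when its squared-error contribution exceeds that price; the label term adds a further penalty whenever an assignment contradicts a known label.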
Pages: 1911-1921
Number of pages: 11