Rank-based outlier detection

Cited by: 30
Authors
Huang, Huaming [1]
Mehrotra, Kishan [1]
Mohan, Chilukuri K. [1]
Affiliations
[1] Syracuse Univ, Dept EECS, Syracuse, NY 13244 USA
Keywords
outlier detection; ranking; neighbourhood sets
DOI
10.1080/00949655.2011.621124
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
We propose a new approach for outlier detection, based on a ranking measure that focuses on the question of whether a point is 'central' for its nearest neighbours. In our notation, a low cumulative rank implies that the point is central. For instance, a point centrally located in a cluster has a relatively low cumulative sum of ranks because it is among the nearest neighbours of its own nearest neighbours, but a point at the periphery of a cluster has a high cumulative sum of ranks because its nearest neighbours are closer to each other than to the point. The use of ranks eliminates the problem of density calculation in the neighbourhood of the point, and this improves performance. Our method performs better than several density-based methods on some synthetic data sets as well as on some real data sets.
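The cumulative-rank idea in the abstract can be illustrated with a minimal sketch. It is not necessarily the authors' exact formulation: the function name `cumulative_rank_scores`, the Euclidean metric, the index-based tie-breaking, and the absence of any normalisation are all assumptions made for illustration. For each point p, it sums the rank that p holds in the distance ordering of each of p's k nearest neighbours; a larger sum suggests a more peripheral point.

```python
import numpy as np

def cumulative_rank_scores(points, k):
    """For each point p, sum p's rank among the neighbours of each of p's k nearest neighbours.

    Hypothetical helper: metric, tie-breaking and lack of normalisation are assumptions.
    """
    X = np.asarray(points, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Force each point to sort first in its own ordering, even with duplicate points.
    np.fill_diagonal(dist, -1.0)
    # order[q] lists all points by increasing distance from q (ties broken by index).
    order = np.argsort(dist, axis=1, kind="stable")
    # rank[q, p] = position of p in q's ordering (0 for q itself, 1 for its nearest neighbour).
    rank = np.empty_like(order)
    rows = np.arange(n)[:, None]
    rank[rows, order] = np.arange(n)[None, :]
    scores = np.empty(n)
    for p in range(n):
        neighbours = order[p, 1:k + 1]          # the k nearest neighbours of p
        scores[p] = rank[neighbours, p].sum()   # how far down p sits in their orderings
    return scores
```

Calling `cumulative_rank_scores(data, k)` on an array with one row per point returns one score per point; sorting the scores in decreasing order gives a candidate outlier ranking under these assumptions.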
Pages: 518-531
Number of pages: 14
Related papers
18 in total
[1]   Distance-based detection and prediction of outliers [J].
Angiulli, F ;
Basta, S ;
Pizzuti, C .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (02) :145-160
[2]   Outlier mining in large high-dimensional data sets [J].
Angiulli, F ;
Pizzuti, C .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (02) :203-215
[3]  
Baeza-Yates R.A., 1999, Modern Information Retrieval
[4]   LOF: Identifying density-based local outliers [J].
Breunig, MM ;
Kriegel, HP ;
Ng, RT ;
Sander, J .
SIGMOD RECORD, 2000, 29 (02) :93-104
[5]   Enhancing effectiveness of density-based outlier mining scheme with density-similarity-neighbor-based outlier factor [J].
Cao, Hui ;
Si, Gangquan ;
Zhang, Yanbin ;
Jia, Lixin .
EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (12) :8090-8101
[6]   FindOut: Finding Outliers in Very Large Datasets [J].
Yu, Dantong ;
Sheikholeslami, Gholamhosein ;
Zhang, Aidong .
KNOWLEDGE AND INFORMATION SYSTEMS, 2002, 4 (04) :387-412
[7]  
Guha S., 1998, CURE, P73, DOI 10.1145/276305.276312
[8]   Inlier-based Outlier Detection via Direct Density Ratio Estimation [J].
Hido, Shohei ;
Tsuboi, Yuta ;
Kashima, Hisashi ;
Sugiyama, Masashi ;
Kanamori, Takafumi .
ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, :223-232
[9]   Some issues about outlier detection in rough set theory [J].
Jiang, Feng ;
Sui, Yuefei ;
Cao, Cungen .
EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) :4680-4687
[10]  
Jin W, 2006, LECT NOTES ARTIF INT, V3918, P577