A hybrid approach to outlier detection based on boundary region

Cited by: 23
Authors
Jiang, Feng [1]
Sui, Yuefei [2]
Cao, Cungen [2]
Affiliations
[1] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao 266061, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Outlier detection; Rough sets; Boundary; Distance; KDD
DOI
10.1016/j.patrec.2011.07.002
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, much attention has been given to the problem of outlier detection, which aims to detect outliers: objects that behave in an unexpected way or have abnormal properties. The identification of outliers is important for many applications, such as intrusion detection, credit card fraud detection, the detection of criminal activities in electronic commerce, medical diagnosis, and anti-terrorism. In this paper, we propose a hybrid approach to outlier detection that combines ideas from boundary-based and distance-based methods for outlier detection (Jiang et al., 2005, 2009; Knorr and Ng, 1998). We give a novel definition of outliers, BD (boundary and distance)-based outliers, by virtue of the notion of boundary region in rough set theory and the definitions of distance-based outliers. An algorithm to find such outliers is also given. The effectiveness of our method for outlier detection is demonstrated on two publicly available databases. © 2011 Elsevier B.V. All rights reserved.
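The abstract names two ingredients without defining them here: the boundary region from rough set theory and distance-based outliers in the sense of Knorr and Ng (1998). The sketch below illustrates each ingredient separately, under stated assumptions. It is not the paper's BD-based definition or algorithm, which this record does not reproduce. The function names, the toy data, and the overlap (mismatch-count) distance are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def boundary_region(objects, attrs, target):
    """Rough-set boundary region of `target` (a set of object indices).

    Objects are grouped into indiscernibility classes by their values on
    `attrs`. A class lies in the boundary when it overlaps `target` without
    being contained in it (upper approximation minus lower approximation).
    """
    classes = defaultdict(set)
    for i, obj in enumerate(objects):
        classes[tuple(obj[a] for a in attrs)].add(i)
    boundary = set()
    for cls in classes.values():
        if cls & target and not cls <= target:
            boundary |= cls
    return boundary


def overlap_distance(x, y):
    """Number of attributes on which x and y differ (an assumed metric)."""
    return sum(a != b for a, b in zip(x, y))


def db_outliers(objects, p, d, dist=overlap_distance):
    """DB(p, D)-outliers: objects for which at least a fraction p of the
    data set lies at distance greater than d."""
    n = len(objects)
    found = set()
    for i, x in enumerate(objects):
        far = sum(1 for y in objects if dist(x, y) > d)
        if far >= p * n:
            found.add(i)
    return found


# Toy categorical data (hypothetical): five objects, two attributes.
objects = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y"), ("c", "z")]

# Boundary of the set {0, 1, 3} when only attribute 0 is observed.
print(boundary_region(objects, attrs=[0], target={0, 1, 3}))

# Objects from which at least 75% of the data lies farther than 1.
print(db_outliers(objects, p=0.75, d=1))
```

How the paper combines boundary membership with the distance criterion is not stated in this record, so the two sets are computed independently here.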
Pages: 1860-1870
Number of pages: 11
Related papers
34 entries in total
[1] Aggarwal C. C., 2001, SIGMOD Record, V30, P37, DOI 10.1145/376284.375668
[2] Angiulli F., 2002, Principles of Data Mining and Knowledge Discovery. 6th European Conference, PKDD 2002. Proceedings (Lecture Notes in Artificial Intelligence Vol.2431), P15
[3] [Anonymous], 1992, Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory, DOI 10.1007/978-94-015-7975-9_21
[4] Barnett V., 1994, Outliers in statistical data
[5] Bay S.D., 1999, UCI KDD REPOSITORY
[6] Bolton RJ, 2002, STAT SCI, V17, P235
[7] Breunig, MM; Kriegel, HP; Ng, RT; Sander, J. LOF: Identifying density-based local outliers [J]. SIGMOD RECORD, 2000, 29 (02): 93-104
[8] Chen, Yumin; Miao, Duoqian; Zhang, Hongyun. Neighborhood outlier detection [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (12): 8745-8749
[9] Chen YM, 2008, LECT NOTES ARTIF INT, V5306, P283, DOI 10.1007/978-3-540-88425-5_29
[10] Eskin Eleazar, 2002, DATA MINING SECURITY