DFNO: Detecting Fuzzy Neighborhood Outliers

Cited by: 3
Authors
Yuan, Zhong [1]
Hu, Peng [1]
Chen, Hongmei [2]
Chen, Yingke [3]
Li, Qilin [4]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China
[3] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne, England
[4] State Grid Sichuan Elect Power Co, Chengdu 610045, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Anomaly detection; Computational modeling; Data models; Nearest neighbor methods; Uncertainty; Numerical models; Measurement; Kernel; Data engineering; Indexes; Granular computing; fuzzy information granulation theory; fuzzy neighborhood; outlier detection; mixed-attribute data; INFORMATION GRANULATION;
DOI
10.1109/TKDE.2024.3484448
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection (OD) has attracted extensive research because of its applications in many fields. Neighborhood computing is one of the most widely used ideas in outlier analysis. However, existing neighborhood-based methods mainly rely on certainty-based strategies to model outlier detection, so they cannot effectively handle the fuzzy information in a dataset. Moreover, they focus mainly on numerical data and cannot effectively find outliers in mixed-attribute data. Fuzzy information granulation theory is an effective granular computing model that allows an object to belong to a set to a certain extent (i.e., with a membership degree), which makes it better suited to uncertainty problems such as fuzziness. In this work, we propose an outlier detection model based on fuzzy neighborhoods. First, a hybrid fuzzy similarity is constructed to granulate the set of objects into fuzzy information granules. Second, the fuzzy $k$-nearest neighbor is defined to describe fuzzy local information. Then, the fuzzy neighborhood density is defined to indicate the degree of aggregation around each object: the smaller the fuzzy neighborhood density of an object, the more likely it is to be an outlier. Based on this idea, the fuzzy neighborhood deviation degree is defined to quantify the degree to which an object is an outlier. Finally, the fuzzy deviation degree on the set of conditional attributes is constructed to give the outlier scores of objects. Experimental comparisons with state-of-the-art methods show that the proposed method achieves a significant improvement on the AUC index and is applicable to three types of data.
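The abstract names the pipeline steps but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' definitions. It assumes a hybrid similarity that uses 1 - |difference| on min-max scaled numerical attributes and exact match on nominal attributes, combined across attributes with the minimum; it assumes the fuzzy neighborhood density of an object is its mean similarity to its k most similar objects; and it assumes the outlier score is simply 1 minus that density. The function names `hybrid_fuzzy_similarity` and `density_outlier_scores` are hypothetical, and the final aggregation over conditional-attribute subsets described in the abstract is omitted.

```python
import numpy as np


def hybrid_fuzzy_similarity(num, cat):
    """Pairwise similarity in [0, 1] for mixed data (assumed form, not the paper's)."""
    per_attribute = []

    # Numerical attributes: min-max scale, then similarity = 1 - |difference|.
    lo = num.min(axis=0)
    span = num.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns: avoid division by zero
    scaled = (num - lo) / span
    for j in range(scaled.shape[1]):
        col = scaled[:, j]
        per_attribute.append(1.0 - np.abs(col[:, None] - col[None, :]))

    # Nominal attributes: similarity is 1 on exact match, 0 otherwise.
    for j in range(cat.shape[1]):
        col = cat[:, j]
        per_attribute.append((col[:, None] == col[None, :]).astype(float))

    # Combine attributes with the minimum (a common fuzzy intersection).
    return np.min(np.stack(per_attribute), axis=0)


def density_outlier_scores(sim, k):
    """Score = 1 - mean similarity to the k most similar other objects (assumed)."""
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)            # an object is not its own neighbor
    neighbors = np.argsort(-masked, axis=1)[:, :k]
    density = np.take_along_axis(sim, neighbors, axis=1).mean(axis=1)
    return 1.0 - density                          # low density -> high score


# Toy data: five similar objects and one that differs in both attributes.
num = np.array([[1.0], [1.1], [0.9], [1.05], [0.95], [5.0]])
cat = np.array([["a"], ["a"], ["a"], ["a"], ["a"], ["b"]])

sim = hybrid_fuzzy_similarity(num, cat)
scores = density_outlier_scores(sim, k=2)
print(np.round(scores, 3))
```

On this toy data the last object shares no nominal value with any other object, so its similarity to every neighbor is 0 under the minimum, and it receives the highest score.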
Pages: 200-209
Number of pages: 10