A New Outlier Detection Model Using Random Walk on Local Information Graph

被引:21
作者
Wang, Chao [1 ]
Gao, Hui [1 ]
Liu, Zhen [1 ]
Fu, Yan
机构
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
来源
IEEE ACCESS | 2018年 / 6卷
基金
中国国家自然科学基金;
关键词
Outlier detection; global outlier; local outlier; Markov random walk; local information graph; ANOMALY DETECTION; FRAUD; ALGORITHM;
D O I
10.1109/ACCESS.2018.2883681
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Large number of outlier detection methods have emerged in recent years due to their importance in many real-world applications. The graph-based methods, which can effectively capture the inter-dependencies of related objects, is one of the most powerful methods in this area. However, most of the graph-based methods ignore the local information around each node, which leads to a high false-positive rate for outlier detection. In this study, we present a new outlier detection model, which combines the graph representation with the local information around each object to construct a local information graph, and calculates the outlier score by performing a random walk process on the graph. Local information graph is constructed to capture the asymmetric inter-dependencies relationship between various types of data objects. Based on two different types of restart vectors to solve the dangling link problem, we propose two distinct algorithms for outlier detection. Experiments on synthetic datasets indicate that the proposed algorithms could efficiently distinguish outlier objects in different distributed datasets. Furthermore, the results on a number of real-world datasets also show that our approaches outperform the state-of-the-art outlier detection methods.
引用
收藏
页码:75531 / 75544
页数:14
相关论文
共 35 条
[1]   A survey of anomaly detection techniques in financial domain [J].
Ahmed, Mohiuddin ;
Mahmood, Abdun Naser ;
Islam, Md. Rafiqul .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2016, 55 :278-288
[2]  
[Anonymous], 1990, PROC S APPL MATH
[3]  
[Anonymous], 1998, Cambridge Series in Statistical and Probabilistic Mathematics
[4]  
[Anonymous], 1999, TECH REP
[5]  
[Anonymous], 2012, DENUMERABLE MARKOV C
[6]  
[Anonymous], MATH STAT METHODS AC
[7]  
[Anonymous], INT J COMPUT APPL
[8]  
Bolton R. J., 2001, CREDIT SCORING CREDI, P235
[9]   LOF: Identifying density-based local outliers [J].
Breunig, MM ;
Kriegel, HP ;
Ng, RT ;
Sander, J .
SIGMOD RECORD, 2000, 29 (02) :93-104
[10]   On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study [J].
Campos, Guilherme O. ;
Zimek, Arthur ;
Sander, Jorg ;
Campello, Ricardo J. G. B. ;
Micenkova, Barbora ;
Schubert, Erich ;
Assent, Ira ;
Houle, Michael E. .
DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (04) :891-927