Distance Based Joint Probability Density Estimation For Unsupervised Outlier Detection

Cited by: 0
Authors
Rehman, Atiq Ur [1]
Belhaouari, Samir Brahim [1]
Affiliations
[1] Hamad Bin Khalifa Univ, ICT Div, Coll Sci & Engn, Doha, Qatar
Source
2021 IEEE JORDAN INTERNATIONAL JOINT CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY (JEEIT) | 2021
Keywords
Anomaly detection; Data mining; Joint Probability Density Estimation; Outlier detection; Unsupervised learning;
DOI
10.1109/JEEIT53412.2021.9634099
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Outlier detection is a vital preprocessing step in data mining and is of great importance for Machine Learning (ML) algorithms. If an ML model is trained without first removing outliers from the data, those outliers can degrade the model's prediction accuracy and make its output misleading. Given the importance of outlier detection, this paper proposes an unsupervised outlier detection mechanism based on Joint Probability Density Estimation (JPDE) integrated with a Distance Measure (DM). The proposed approach has the advantage of using only a single-dimensional distance vector to identify the outliers in a dataset, which enables the algorithm to find outliers in a high-dimensional dataset with low computational complexity. Furthermore, three different approaches based on JPDE-DM are proposed and evaluated on complex synthetic benchmark datasets.
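The abstract does not specify the distance measure, the density estimator, or the three JPDE-DM variants, so the following is only a minimal illustrative sketch of the general idea (reduce each point to one distance value, estimate a density over that 1-D vector, flag low-density points). It is not the authors' algorithm. The distance to the centroid, the Gaussian kernel, the normal-reference bandwidth, and the `contamination` parameter are all assumptions introduced here, and the function names are hypothetical.

```python
import math


def distance_vector(points):
    """Reduce each point to one number: its Euclidean distance to the
    dataset centroid (an assumed choice of distance measure)."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    return [math.dist(p, centroid) for p in points]


def gaussian_kde_1d(values, bandwidth):
    """Return a function that evaluates a 1-D Gaussian kernel density
    estimate built from `values` (an assumed choice of estimator)."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def flag_outliers(points, contamination=0.05):
    """Flag roughly the `contamination` fraction of points whose distance
    value has the lowest estimated density. Returns a list of booleans."""
    d = distance_vector(points)
    n = len(d)
    mean = sum(d) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in d) / (n - 1))
    # Normal-reference rule-of-thumb bandwidth; fall back to 1.0 if all
    # distances are identical.
    h = 1.06 * std * n ** (-0.2) if std > 0 else 1.0
    density = gaussian_kde_1d(d, h)
    scores = [density(x) for x in d]
    k = max(1, round(contamination * n))
    threshold = sorted(scores)[k - 1]
    return [s <= threshold for s in scores]
```

Because the density is estimated over a single distance vector, each density evaluation involves only scalar arithmetic regardless of the number of features; only the initial distance computation depends on the dimensionality. Calling `flag_outliers(points)` on a list of equal-length numeric sequences returns one boolean per point.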
Pages: 256-261
Page count: 6