Locally centred Mahalanobis distance: A new distance measure with salient features towards outlier detection

Cited by: 51
Authors
Todeschini, Roberto [1]
Ballabio, Davide [1]
Consonni, Viviana [1]
Sahigara, Faizan [1]
Filzmoser, Peter [2]
Affiliations
[1] Univ Milano Bicocca, Dept Earth & Environm Sci, Milano Chemometr & QSAR Res Grp, I-20126 Milan, Italy
[2] Vienna Univ Technol, Dept Stat & Probabil Theory, A-1040 Vienna, Austria
Keywords
Mahalanobis distance; Outlier detection; Similarity; Isolation degree; Remoteness; Covariance matrix; Data mining;
DOI
10.1016/j.aca.2013.04.034
Chinese Library Classification number
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
Outlier detection is a prerequisite for identifying aberrant samples in a given set of data. Identifying such samples is particularly important in multivariate data analysis, where increasing data dimensionality can easily hinder data exploration and outliers often go undetected. This paper introduces a novel Mahalanobis distance measure (more precisely, a pseudo-distance) termed the locally centred Mahalanobis distance, derived by centring the covariance matrix at each data sample rather than at the data centroid as in the classical covariance matrix. Two parameters, called Remoteness and Isolation degree, were derived from the resulting pairwise distance matrix; their salient features facilitated better identification of atypical samples isolated from the rest of the data, reflecting their potential for outlier detection. The Isolation degree was shown to detect a new kind of outlier, namely isolated samples within the data domain, making it a useful diagnostic tool for evaluating the reliability of predictions obtained by local models (e.g. k-NN models). To better understand the role of Remoteness and Isolation degree in identifying such aberrant samples, several simulated data sets and published data sets from the literature were considered as case studies, and the results were compared with those obtained using the Euclidean distance and the classical Mahalanobis distance. (C) 2013 Elsevier B.V. All rights reserved.
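As an illustration only, the central idea of the abstract (a covariance matrix centred at each sample instead of at the centroid) might be sketched as follows. This is a reading of the abstract, not the authors' published formulation: the 1/(n-1) scaling, the function names `solve`, `local_covariance` and `locally_centred_distances`, and the exact form d(i, j) = sqrt((x_j - x_i)^T S_i^{-1} (x_j - x_i)) are all assumptions. Remoteness and Isolation degree are not implemented because the abstract does not define them.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("locally centred covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def local_covariance(data, i):
    """Covariance-like matrix of all samples, centred at sample i.

    Assumption: scaled by 1/(n-1), mirroring the classical estimator.
    """
    n, p = len(data), len(data[0])
    centre = data[i]
    cov = [[0.0] * p for _ in range(p)]
    for row in data:
        d = [row[k] - centre[k] for k in range(p)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += d[a] * d[b] / (n - 1)
    return cov


def locally_centred_distances(data):
    """Pairwise matrix where row i uses the covariance centred at sample i."""
    n = len(data)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cov = local_covariance(data, i)
        for j in range(n):
            if i == j:
                continue
            diff = [xj - xi for xi, xj in zip(data[i], data[j])]
            z = solve(cov, diff)  # z = S_i^{-1} (x_j - x_i)
            quad = sum(d * w for d, w in zip(diff, z))
            dist[i][j] = sqrt(max(0.0, quad))  # guard against rounding below zero
    return dist
```

Under these assumptions the resulting matrix is generally not symmetric, because row i and row j use different covariance matrices; that would be consistent with the abstract calling the measure a pseudo-distance. Calling `locally_centred_distances` on a list of equal-length numeric rows returns the full n-by-n matrix.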
Pages: 1-9
Page count: 9
Related papers
50 items in total
  • [41] A new kernelization framework for Mahalanobis distance learning algorithms
    Chatpatanasiri, Ratthachat
    Korsrilabutr, Teesid
    Tangchanachaianan, Pasakorn
    Kijsirikul, Boonserm
    NEUROCOMPUTING, 2010, 73 (10-12) : 1570 - 1579
  • [42] Outlier Detection Based on Reversed K-Nearest Neighborhood MST of Relative Distance Measure
    Yang X.-L.
    Feng S.
    Yuan Z.
    Feng, Shan (fengshanrq@sohu.com), 2020, Chinese Institute of Electronics (48): : 937 - 945
  • [43] Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance
    Das, Sourya Dipta
    Vadi, Yash
    Unnam, Abhishek
    Yadav, Kuldeep
    INTERSPEECH 2023, 2023, : 1978 - 1982
  • [44] Anomaly detection of tripod shafts using modified Mahalanobis distance
    Sunghyun Lee
    Jong-Won Park
    Do-Sik Kim
    Insu Jeon
    Dong-Cheon Baek
    Journal of Mechanical Science and Technology, 2018, 32 : 2473 - 2478
  • [45] Research on the detection method of driver fatigue based on Mahalanobis Distance
    Qi Yu-ming
    Deng San-peng
    Wang Qian
    Miao De-hua
    Guo Shi-jie
    2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL IX, 2010, : 77 - 80
  • [46] Visual Saliency Detection Based on Mahalanobis Distance and Feature Evaluation
    Yao, Zhijun
    2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 251 - 255
  • [47] Sensor Fault Detection Based on Particle Filter and Mahalanobis Distance
    Li, Tianzhi
    Liu, Gang
    Zhang, Liangliang
    JORDAN JOURNAL OF CIVIL ENGINEERING, 2019, 13 (04) : 501 - 507
  • [48] Statistics Mahalanobis distance for incipient sensor fault detection and diagnosis
    Ji, Hongquan
    CHEMICAL ENGINEERING SCIENCE, 2021, 230
  • [49] Hardware Trojan Detection Based on Cluster Analysis of Mahalanobis Distance
    Cui, Qi
    Zhang, Lei
    Sun, Kewang
    Li, Dongxu
    Wang, Sixiang
    2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 1, 2016, : 234 - 238
  • [50] Anomaly detection of tripod shafts using modified Mahalanobis distance
    Lee, Sunghyun
    Park, Jong-Won
    Kim, Do-Sik
    Jeon, Insu
    Baek, Dong-Cheon
    JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2018, 32 (06) : 2473 - 2478