Data-driven cluster analysis method: a novel outliers detection method in multivariate data

Cited by: 0
Authors
Duarte, A. R. [1]
Barbosa, J. J. [1]
Martins, H. S. R. [1]
Oliveira, F. L. P. [1]
Affiliations
[1] Univ Fed Ouro Preto, Stat Dept, Ouro Preto, Brazil
Keywords
Data-driven; Multivariate outliers; Cluster analysis; Bayesian information criterion; Accuracy; Mahalanobis distance; Identification
DOI
10.1080/03610918.2024.2376872
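Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract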
Detecting multivariate outliers is crucial in statistical studies: analyses carried out without identifying possible outliers may produce incorrect results. This study proposes a new technique for detecting multivariate outliers based on cluster analysis. The method relies on information inherent in the data itself. We compare the methodology with three widely used detection methods. The comparison considers detection techniques based on the Mahalanobis distance. Sensitivity, specificity, and accuracy are used to assess the quality of the methods, together with an analysis of the CPU time required to carry out the procedures. The new technique showed clear superiority over the others.
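For readers unfamiliar with the baseline and the evaluation measures named in the abstract, the following is a minimal sketch of classical Mahalanobis-distance outlier flagging scored by sensitivity, specificity, and accuracy. It is not the cluster-based method the paper proposes, and it is not taken from the paper. The function names, the synthetic two-dimensional data, and the 0.975 chi-square cutoff level are all assumptions made for illustration.

```python
import math
import random


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    result = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^{-1} [dx dy]^T with the 2x2 inverse written out.
        result.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return result


def flag_outliers(points, level=0.975):
    """Flag points whose squared distance exceeds a chi-square quantile.

    With 2 degrees of freedom the quantile has the closed form -2*ln(1 - level).
    The level 0.975 is an illustrative assumption, not a value from the paper.
    """
    cutoff = -2.0 * math.log(1.0 - level)
    return [d2 > cutoff for d2 in mahalanobis_sq_2d(points)]


def detection_scores(truth, flagged):
    """Sensitivity, specificity, and accuracy of the flags against known labels.

    Assumes both classes (outlier and non-outlier) are present in `truth`.
    """
    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)
    tn = sum(1 for t, f in pairs if not t and not f)
    fp = sum(1 for t, f in pairs if not t and f)
    fn = sum(1 for t, f in pairs if t and not f)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def demo():
    """Synthetic example: 200 standard-normal inliers plus 5 planted outliers."""
    rng = random.Random(0)
    inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    outliers = [(8.0, 8.0), (9.0, -8.0), (-8.0, 9.0), (-9.0, -9.0), (10.0, 0.0)]
    points = inliers + outliers
    truth = [False] * len(inliers) + [True] * len(outliers)
    return detection_scores(truth, flag_outliers(points))


if __name__ == "__main__":
    print(demo())
```

Because the mean and covariance here are estimated from all points, including the outliers, this classical variant can suffer from masking when contamination is heavy; robust estimators are the usual remedy.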
Pages: 21
Related papers
50 in total
  • [1] ON ROHLFS METHOD FOR THE DETECTION OF OUTLIERS IN MULTIVARIATE DATA
    CARONI, C
    PRESCOTT, P
    JOURNAL OF MULTIVARIATE ANALYSIS, 1995, 52 (02) : 295 - 307
  • [2] A Novel Data-Driven Fault Detection Method Inspired by Parallel Distributed Compensation
    Chen Zhaoxu
    Fang Huajing
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 6314 - 6319
  • [3] A Novel Hybrid Data-Driven Modeling Method for Missiles
    He, Yongxiang
    Guo, Hongwu
    Han, Yang
    SYMMETRY-BASEL, 2020, 12 (01):
  • [4] Detection of Outliers Method in Grouped Multivariate Data: A Method Based on Multiple Linear Regression
    Phuttisen, Suthat
    Srisodaphol, Wuttichai
    PAKISTAN JOURNAL OF STATISTICS AND OPERATION RESEARCH, 2024, 20 (03) : 445 - 453
  • [5] A data-driven method for the deformation analysis of layered rocks
    Feng, Fanding
    Yang, Diansen
    Jiang, Qinghui
    INTERNATIONAL JOURNAL OF ROCK MECHANICS AND MINING SCIENCES, 2025, 186
  • [6] Data-driven Research Method For Power System Stability Detection
    Jia Tianxia
    Gu Zhuoyuan
    Sun Huadong
    Gao Pengfei
    Yi Jun
    Xu Shiyun
    Zhao Bing
    2018 INTERNATIONAL CONFERENCE ON POWER SYSTEM TECHNOLOGY (POWERCON), 2018, : 3061 - 3069
  • [7] Data-driven method of damage detection using sparse sensors installation by SEREPa
    Ghannadi, Parsa
    Kourehli, Seyed Sina
    JOURNAL OF CIVIL STRUCTURAL HEALTH MONITORING, 2019, 9 (04) : 459 - 475
  • [8] Novel subgroups of obesity and their association with outcomes: a data-driven cluster analysis
    Takeshita, Saki
    Nishioka, Yuichi
    Tamaki, Yuko
    Kamitani, Fumika
    Mohri, Takako
    Nakajima, Hiroki
    Kurematsu, Yukako
    Okada, Sadanori
    Myojin, Tomoya
    Noda, Tatsuya
    Imamura, Tomoaki
    Takahashi, Yutaka
    BMC PUBLIC HEALTH, 2024, 24 (01)
  • [9] Data-Driven Method for Missing Harmonic Data Completion
    Xu, Rui
    Ma, Xiaoyang
    Zhou, Runze
    Zhao, Jinshuai
    Wang, Ying
    IEEE ACCESS, 2021, 9 : 164037 - 164046
  • [10] A Data-driven Fault Detection Method Based on Dissipative Trajectories
    Lei, Qingyang
    Munir, Muhammad Tajammal
    Bao, Jie
    Young, Brent
    IFAC PAPERSONLINE, 2016, 49 (07) : 717 - 722