Data-driven cluster analysis method: a novel outliers detection method in multivariate data

Citations: 0
Authors
Duarte, A. R. [1]
Barbosa, J. J. [1]
Martins, H. S. R. [1]
Oliveira, F. L. P. [1]
Affiliations
[1] Univ Fed Ouro Preto, Stat Dept, Ouro Preto, Brazil
Keywords
Data-driven; Multivariate outliers; Cluster analysis; Bayesian information criterion; Accuracy; Mahalanobis distance; Identification
DOI
10.1080/03610918.2024.2376872
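Chinese Library Classification numbers
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract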
Detection of multivariate outliers is crucial in statistical studies, because statistical analyses carried out without identifying possible outliers may produce incorrect results. This study proposes a new technique for detecting multivariate outliers based on cluster analysis. The method uses information inherent in the data itself. We compare the methodology with three widely used detection methods. The comparison considers detection techniques based on the Mahalanobis distance. Sensitivity, specificity, and accuracy are used to assess the quality of the methods, together with an analysis of the CPU time required to carry out the procedures. The new technique showed a clear superiority over the others.
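
The abstract names Mahalanobis-distance detectors as the comparison baseline and sensitivity, specificity, and accuracy as the evaluation measures. The following is a minimal sketch of those two ingredients only; it is not the authors' cluster-based method, which the abstract does not specify. It assumes two-dimensional data (so the chi-square cutoff with 2 degrees of freedom has the closed form -2 ln(alpha)) and classical, non-robust mean and covariance estimates. All function names and the default alpha are hypothetical choices made for illustration.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    using the classical (non-robust) sample covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = v^T S^{-1} v, with S^{-1} written out for the 2x2 case
        dists.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return dists


def chi2_cutoff_2df(alpha=0.025):
    """Upper-alpha quantile of the chi-square distribution with 2 degrees
    of freedom (assumed cutoff; alpha is an illustrative default)."""
    return -2.0 * math.log(alpha)


def flag_outliers(points, alpha=0.025):
    """Flag points whose squared distance exceeds the chi-square cutoff."""
    cutoff = chi2_cutoff_2df(alpha)
    return [d > cutoff for d in mahalanobis_sq_2d(points)]


def detection_metrics(truth, pred):
    """Sensitivity, specificity, and accuracy, treating 'outlier' as the
    positive class. A ratio with a zero denominator is reported as NaN."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }
```

Given a list of known outlier labels `truth`, a call such as `detection_metrics(truth, flag_outliers(points))` yields the three measures for this baseline. Because the mean and covariance are themselves distorted by outliers, a classical detector of this kind can miss clustered outliers, which is one reason robust or data-driven alternatives are studied.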
Pages: 21
Related papers
50 in total
  • [41] A novel method for quantitative fault diagnosis of photovoltaic systems based on data-driven
    Guo, Hui
    Hu, Shan
    Wang, Fei
    Zhang, Lijun
    ELECTRIC POWER SYSTEMS RESEARCH, 2022, 210
  • [42] Robust detection of multiple outliers in grouped multivariate data
    Caroni, Chrys
    Billor, Nedret
    JOURNAL OF APPLIED STATISTICS, 2007, 34 (10) : 1241 - 1250
  • [43] A novel capacity demand analysis method of energy storage system for peak shaving based on data-driven
    Hong, Zhenpeng
    Wei, Zixuan
    Li, Jianlin
    Han, Xiaojuan
    JOURNAL OF ENERGY STORAGE, 2021, 39 (39)
  • [44] A Data-Driven Polarimetric Calibration Method for Entomological Radar
    Hu, Cheng
    Li, Muyang
    Li, Weidong
    Wang, Rui
    Yu, Teng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [45] A Data-Driven Fault Prediction Method for Power Transformers
    Chen, Zhuo
    Chen, Junxingxu
    Qiao, Hong
    Xu, Xianyong
    Xiao, Jian
    Long, Yanbo
    2021 13TH INTERNATIONAL CONFERENCE ON MEASURING TECHNOLOGY AND MECHATRONICS AUTOMATION (ICMTMA 2021), 2021 : 145 - 149
  • [46] Data-driven crowd evacuation: A reinforcement learning method
    Yao, Zhenzhen
    Zhang, Guijuan
    Lu, Dianjie
    Liu, Hong
    NEUROCOMPUTING, 2019, 366 : 314 - 327
  • [47] A Data-Driven Heuristic Method for Irregular Flight Recovery
    Wang, Nianyi
    Wang, Huiling
    Pei, Shan
    Zhang, Boyu
    MATHEMATICS, 2023, 11 (11)
  • [48] Data-Driven Tie-line Scheduling Method
    Cai, Zhi
    Dai, Sai
    Dai, Yuhan
    Zhang, Guofang
    Lu, Yi
    PROCEEDINGS OF 2020 IEEE 9TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS'20), 2020 : 834 - 838
  • [49] A data-driven method for estimating wheel flat length
    Ye, Yunguang
    Shi, Dachuan
    Krause, Philipp
    Hecht, Markus
    VEHICLE SYSTEM DYNAMICS, 2020, 58 (09) : 1329 - 1347
  • [50] INVERSION BASED DATA-DRIVEN ATTENUATION COMPENSATION METHOD
    Wang, Benfeng
    Chen, Xiaohong
    Li, Jingye
    JOURNAL OF SEISMIC EXPLORATION, 2014, 23 (04) : 341 - 356