Missing data techniques in classification for cardiovascular dysautonomias diagnosis

Cited by: 0
Authors
Ali Idri
Ilham Kadi
Ibtissam Abnane
José Luis Fernandez-Aleman
Affiliations
[1] Mohammed V University, Software Project Management Research Team
[2] Mohammed VI Polytechnic University, CSEHS
[3] University of Murcia, MSDA
Source
Medical & Biological Engineering & Computing | 2020 / Vol. 58
Keywords
Missing data; KNN imputation; Missingness mechanism; Cardiology
DOI
Not available
Abstract
Missing data (MD) is a common and unavoidable problem for data mining (DM)-based decision systems in e-health, since many historical medical datasets contain a large number of missing values. A pre-processing stage is therefore usually required to handle missing values before building any DM-based decision system. The purpose of this paper is to evaluate the impact of MD techniques on classification systems for cardiovascular dysautonomias diagnosis. We analyzed and compared the accuracy rates of four classification techniques, random forest (RF), support vector machines (SVM), C4.5 decision tree, and Naive Bayes (NB), combined with two MD techniques: deletion and imputation with k-nearest neighbors (KNN). A total of 216 experiments were carried out, covering the four classifiers, three missingness mechanisms (MCAR: missing completely at random, MAR: missing at random, and NMAR: not missing at random), two MD techniques (deletion and KNN imputation), and nine MD percentages from 10 to 90% (4 × 3 × 2 × 9 = 216), on a dataset collected from the autonomic nervous system (ANS) unit of the University Hospital Avicenne in Morocco. The results suggest that using KNN imputation rather than deletion improves the accuracy rates of the four classifiers. Moreover, the MD percentage has a negative impact on the performance of the classification techniques regardless of the MD mechanism and MD technique used: the accuracy rates of the four classifiers decrease as the MD percentage increases.
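The two MD techniques compared in the abstract, deletion and KNN imputation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which software, distance measure, or value of k was used. The function names (`mask_mcar`, `listwise_delete`, `knn_impute`), the toy values, and the rescaled Euclidean distance over shared features are all assumptions made for illustration, and only the MCAR mechanism is simulated.

```python
import math
import random


def mask_mcar(rows, rate, rng):
    """Simulate MCAR: blank each cell independently with probability `rate`."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def listwise_delete(rows):
    """Deletion: keep only the rows that have no missing cell."""
    return [row for row in rows if all(v is not None for v in row)]


def _distance(a, b):
    """Euclidean distance over features observed in both rows.

    The squared sum is rescaled by (total features / shared features) so that
    rows sharing few features are not favoured. This choice is an assumption.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=3):
    """Fill each missing cell with the mean of that feature over the k
    nearest rows in which the feature is observed."""
    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [val for dist, val in donors[:k] if dist < math.inf]
            if nearest:
                new_row[j] = sum(nearest) / len(nearest)
            else:
                # No comparable neighbour: fall back to the column mean.
                observed = [r[j] for r in rows if r[j] is not None]
                new_row[j] = sum(observed) / len(observed) if observed else None
        filled.append(new_row)
    return filled


# Hypothetical toy data: two numeric features, one missing cell.
toy = [[1.0, 2.0], [1.1, None], [5.0, 6.0], [5.2, 6.2], [0.9, 1.8]]

complete_cases = listwise_delete(toy)   # drops the incomplete row
imputed = knn_impute(toy, k=2)          # keeps every row, fills the gap

# A larger experiment would first blank cells at a chosen MD percentage:
masked = mask_mcar(toy, 0.3, random.Random(0))
```

Deletion shrinks the training set as the MD percentage grows, whereas imputation keeps every record at the cost of estimated values, which is the trade-off the study measures. In practice a library routine such as scikit-learn's `KNNImputer` would normally be preferred over hand-written code.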
Pages: 2863 - 2878
Page count: 15
Related papers
50 in total
  • [41] Gas Pressure Prediction and Application with Missing Data Imputation Techniques for Gas Regulator Data
    Park, Hyunwoo
    Jin, Seohoon
    BIG DATA AND SECURITY, ICBDS 2023, PT I, 2024, 2099 : 89 - 104
  • [42] Fault Diagnosis Based on Deep Learning Subject to Missing Data
    Liu, Weibo
    Wei, Dan
    Zhou, Funa
    PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC), 2018, : 3972 - 3977
  • [43] The development of delinquency during adolescence: a comparison of missing data techniques revisited
    Kleinke K.
    Reinecke J.
    Weins C.
    Quality & Quantity, 2021, 55 (3) : 877 - 895
  • [44] Imputing cross-sectional missing data: comparison of common techniques
    Hawthorne, G
    Elliott, P
    AUSTRALIAN AND NEW ZEALAND JOURNAL OF PSYCHIATRY, 2005, 39 (07) : 583 - 590
  • [45] Reconstruction of missing data using compressed sensing techniques with adaptive dictionary
    Perepu, Satheesh K.
    Tangirala, Arun K.
    JOURNAL OF PROCESS CONTROL, 2016, 47 : 175 - 190
  • [46] Missing data imputation techniques for wireless continuous vital signs monitoring
    Mathilde C. van Rossum
    Pedro M. Alves da Silva
    Ying Wang
    Ewout A. Kouwenhoven
    Hermie J. Hermens
    Journal of Clinical Monitoring and Computing, 2023, 37 : 1387 - 1400
  • [47] Techniques for handling convolutional distortion with 'missing data' automatic speech recognition
    Palomäki, KJ
    Brown, GJ
    Barker, JP
    SPEECH COMMUNICATION, 2004, 43 (1-2) : 123 - 142
  • [48] A Benchmark for Missing Data Imputation Techniques: Development Perspectives and Performance Comparative
    Cabrera-Sanchez, Juan-Francisco
    Cruz-Corona, Carlos
    Escolano, Andres Yanez
    Silva-Ramirez, Esther-Lydia
    OPTIMIZATION AND LEARNING, OLA 2024, 2025, 2311 : 140 - 153
  • [49] Imputation techniques on missing values in breast cancer treatment and fertility data
    Xuetong Wu
    Hadi Akbarzadeh Khorshidi
    Uwe Aickelin
    Zobaida Edib
    Michelle Peate
    Health Information Science and Systems, 7
  • [50] Impact of missing data imputation methods on gene expression clustering and classification
    de Souto, Marcilio C. P.
    Jaskowiak, Pablo A.
    Costa, Ivan G.
    BMC BIOINFORMATICS, 2015, 16