Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets

Cited by: 19
Authors
Huang, Min-Wei [1, 2]
Lin, Wei-Chao [3, 4]
Tsai, Chih-Fong [5 ]
Affiliations
[1] China Med Univ, Sch Med, Taichung, Taiwan
[2] Taichung Vet Gen Hosp, Dept Psychiat, Chiayi Branch, Chiayi, Taiwan
[3] Chang Gung Univ, Dept Informat Management, Taoyuan, Taiwan
[4] Chang Gung Mem Hosp, Dept Thorac Surg, Taoyuan, Taiwan
[5] Natl Cent Univ, Dept Informat Management, Taoyuan, Taiwan
Keywords
INSTANCE SELECTION; CLASSIFICATION; REDUCTION;
DOI
10.1155/2018/1817479
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Many real-world medical datasets contain some proportion of missing attribute values. Missing value imputation is commonly used to address this problem: it estimates the missing values through a reasoning process based on the complete (observed) data. However, if the observed data contain noise or outliers, the estimates may be unreliable or may differ substantially from the real values. The aim of this paper is to examine whether combining instance selection on the observed data with missing value imputation performs better than missing value imputation alone. In particular, three instance selection algorithms (DROP3, GA, and IB3) and three imputation algorithms (KNNI, MLP, and SVM) are used to identify the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation for numerical medical datasets, and that specific combinations of instance selection and imputation methods can improve the imputation results for mixed-type medical datasets. However, instance selection does not have a clearly positive impact on the imputation results for categorical medical datasets.
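The two-stage idea in the abstract (filter the observed data first, then impute from what remains) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the instance selection step is a simple centroid-distance filter standing in for DROP3, GA, or IB3; the imputer is a plain k-nearest-neighbor mean rather than the authors' KNNI configuration; and the function names, the `z_max` threshold, and the toy data are all hypothetical.

```python
import math


def euclidean(a, b, cols):
    """Euclidean distance between two rows, restricted to the given columns."""
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))


def remove_outliers(complete_rows, z_max=2.0):
    """Stand-in instance selection (NOT DROP3/GA/IB3): drop rows whose
    distance to the centroid is more than z_max standard deviations
    above the mean distance."""
    n_cols = len(complete_rows[0])
    cols = range(n_cols)
    centroid = [sum(r[c] for r in complete_rows) / len(complete_rows) for c in cols]
    dists = [euclidean(r, centroid, cols) for r in complete_rows]
    mean_d = sum(dists) / len(dists)
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in dists) / len(dists))
    if std_d == 0:
        return list(complete_rows)
    return [r for r, d in zip(complete_rows, dists) if (d - mean_d) / std_d <= z_max]


def knn_impute(row, reference_rows, k=3):
    """Fill None entries of `row` with the mean of that column over the k
    reference rows closest on the columns that `row` does observe."""
    observed = [c for c, v in enumerate(row) if v is not None]
    missing = [c for c, v in enumerate(row) if v is None]
    neighbours = sorted(reference_rows, key=lambda r: euclidean(row, r, observed))[:k]
    filled = list(row)
    for c in missing:
        filled[c] = sum(r[c] for r in neighbours) / len(neighbours)
    return filled


# Hypothetical toy data: five consistent complete rows plus one outlier.
complete = [
    [1.0, 2.0],
    [1.1, 2.1],
    [0.9, 1.9],
    [1.2, 2.2],
    [1.0, 2.1],
    [1.0, 50.0],  # outlier in the second attribute
]
incomplete = [1.0, None]

# Imputation alone: the outlier can be picked as a nearest neighbour.
naive = knn_impute(incomplete, complete)

# Instance selection first, then imputation from the retained rows.
selected = remove_outliers(complete)
cleaned = knn_impute(incomplete, selected)

print("imputation alone:      ", naive)
print("selection + imputation:", cleaned)
```

In this toy case the outlier matches the incomplete row on its observed attribute, so imputation alone pulls the estimate far from the other rows, while filtering first keeps the estimate within their range. This only illustrates the mechanism; it says nothing about how the paper's actual algorithm combinations behave.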
Pages: 9
Related papers
50 in total
  • [1] The Feature Selection Effect on Missing Value Imputation of Medical Datasets
    Liu, Chia-Hui
    Tsai, Chih-Fong
    Sue, Kuen-Liang
    Huang, Min-Wei
    APPLIED SCIENCES-BASEL, 2020, 10 (07):
  • [2] Normalization and outlier removal in class center-based firefly algorithm for missing value imputation
    Nugroho, Heru
    Utama, Nugraha Priya
    Surendro, Kridanto
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [3] Normalization and outlier removal in class center-based firefly algorithm for missing value imputation
    Heru Nugroho
    Nugraha Priya Utama
    Kridanto Surendro
    Journal of Big Data, 8
  • [4] Combining data discretization and missing value imputation for incomplete medical datasets
    Huang, Min-Wei
    Tsai, Chih-Fong
    Tsui, Shu-Ching
    Lin, Wei-Chao
    PLOS ONE, 2023, 18 (11):
  • [5] Model-based clustering and outlier detection with missing data
    Tong, Hung
    Tortora, Cristina
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2022, 16 (01) : 5 - 30
  • [6] Uncertainty Management in Model-Based Imputation for Missing Data
    Azarkhail, Mohammadreza
    Woytowitz, Peter
    59TH ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM (RAMS), 2013,
  • [7] Model-based clustering and outlier detection with missing data
    Hung Tong
    Cristina Tortora
    Advances in Data Analysis and Classification, 2022, 16 : 5 - 30
  • [8] Hybrid prediction model with missing value imputation for medical data
    Purwar, Archana
    Singh, Sandeep Kumar
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (13) : 5621 - 5631
  • [9] Mathura (MBI)-A novel imputation measure for imputation of missing values in medical datasets
    Mathura Bai B.
    Mangathayaru N.
    Padmaja Rani B.
    Aljawarneh S.
    Recent Advances in Computer Science and Communications, 2021, 14 (05) : 1358 - 1369
  • [10] Missing Values and Directional Outlier Detection in Model-Based Clustering
    Tong, Hung
    Tortora, Cristina
    JOURNAL OF CLASSIFICATION, 2024, 41 (03) : 480 - 513