Missing Data Imputation Using Ensemble Learning Technique: A Review

Cited by: 1
Authors
Jegadeeswari, K. [1]
Ragunath, R. [1]
Rathipriya, R. [1]
Affiliations
[1] Periyar Univ, Dept Comp Sci, Salem, Tamil Nadu, India
Source
SOFT COMPUTING FOR SECURITY APPLICATIONS, ICSCS 2022 | 2023 / Vol. 1428
Keywords
Missing data imputations; Ensemble learning; Bagging; Boosting; Stacking and bioinformatics; MULTIPLE IMPUTATION; MICROARRAY DATA; INCOMPLETE DATA; PREDICTION
DOI
10.1007/978-981-19-3590-9_18
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over the past two decades, many studies of missing value imputation in bioinformatics have proposed methods for handling datasets with missing values. When only a small share of the attribute values in a dataset is missing, those values can be removed from the dataset without a noteworthy influence on the final mining result. However, when a large number of attribute values are missing, the missing data must be handled with care, because otherwise the dataset loses valuable information and quality. In particular, datasets with more than one missing attribute value degrade the performance of algorithms. The aim of missing value imputation is to produce a high-quality dataset without losing valuable information, whether the number of missing values is small or large. Meanwhile, ensemble learning techniques have achieved high performance in data mining tasks over the past few years. Researchers therefore favor ensemble learning for imputing missing data, an approach that is hard to ignore now that missing data in bioinformatics datasets is growing rapidly. Ensemble learning aims to combine weak learners into a strong learner, and ensemble techniques can process massive amounts of data efficiently. This paper reviews missing value imputation techniques and ensemble learning models for analyzing biological data.
Pages: 223-236
Page count: 14