Missing Data Imputation Using Ensemble Learning Technique: A Review

Cited by: 1

Authors:

Jegadeeswari, K. [1]

Ragunath, R. [1]

Rathipriya, R. [1]

Affiliations:

[1] Periyar Univ, Dept Comp Sci, Salem, Tamil Nadu, India

Source:

SOFT COMPUTING FOR SECURITY APPLICATIONS, ICSCS 2022 | 2023 / Vol. 1428

Keywords:

Missing data imputations; Ensemble learning; Bagging; Boosting; Stacking and bioinformatics; MULTIPLE IMPUTATION; MICROARRAY DATA; INCOMPLETE DATA; PREDICTION

DOI:

10.1007/978-981-19-3590-9_18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Over the past two decades, many studies have addressed missing value imputation in bioinformatics and have proposed methods for handling datasets with missing values. When a dataset contains only a small number of missing attribute values, those values can be removed from the dataset without a noteworthy influence on the final mining result. However, if a large number of attribute values are missing, careful attention should be given to handling them, because the dataset would otherwise lose valuable information and quality. In particular, datasets with more than one missing attribute value degrade the performance of algorithms. The aim of a missing value imputation method is to provide a high-quality dataset without losing valuable information, whether the proportion of missing values is small or large. Meanwhile, ensemble learning techniques have achieved high performance in data mining tasks over the past few years. Researchers therefore prefer to work on the imputation of missing data using ensemble learning, a technique that cannot be ignored nowadays because missing data in bioinformatics datasets are rapidly increasing. The aim of ensemble learning is to transform weak learners into a strong learner. Ensemble techniques can also process massive amounts of data efficiently. This paper reviews missing value imputation techniques and ensemble learning models for analyzing biological data.

Pages: 223-236

Page count: 14

References (45 in total):

[1] The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer
Aghdam, Rosa
Baghfalaki, Taban
Khosravi, Pegah
Ansari, Elnaz Saberi
[J]. GENOMICS PROTEOMICS & BIOINFORMATICS, 2017, 15 (06) : 396 - 404
[2] Dealing with missing values in large-scale studies: microarray data imputation and beyond
Aittokallio, Tero
[J]. BRIEFINGS IN BIOINFORMATICS, 2010, 11 (02) : 253 - 264
[3] Al-Helali B., 2010, J COUNS PSYCHOL
[4] A new imputation method based on genetic programming and weighted KNN for symbolic regression with incomplete data
Al-Helali, Baligh
Chen, Qi
Xue, Bing
Zhang, Mengjie
[J]. SOFT COMPUTING, 2021, 25 (08) : 5993 - 6012
[5] Impact of missing data on bias and precision when estimating change in patient-reported outcomes from a clinical registry
Ayilara, Olawale F.
Zhang, Lisa
Sajobi, Tolulope T.
Sawatzky, Richard
Bohm, Eric
Lix, Lisa M.
[J]. HEALTH AND QUALITY OF LIFE OUTCOMES, 2019, 17 (1)
[6] Ensemble feature learning for material recognition with convolutional neural networks
Bian, Peng
Li, Wanwan
Jin, Yi
Zhi, Ruicong
[J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2018,
[7] Multiple Imputation for Missing Data via Sequential Regression Trees
Burgette, Lane F.
Reiter, Jerome P.
[J]. AMERICAN JOURNAL OF EPIDEMIOLOGY, 2010, 172 (09) : 1070 - 1076
[8] Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments
Celton, Magalie
Malpertuy, Alain
Lelandais, Gaelle
de Brevern, Alexandre G.
[J]. BMC GENOMICS, 2010, 11
[9] Chen C., 2015, J COMPUTER COMMUNICA, V3, P1
[10] Ensemble correlation-based low-rank matrix completion with applications to traffic data imputation
Chen, Xiaobo
Wei, Zhongjie
Li, Zuoyong
Liang, Jun
Cai, Yingfeng
Zhang, Bob
[J]. KNOWLEDGE-BASED SYSTEMS, 2017, 132 : 249 - 262
