A Review on Missing Value Imputation Algorithms for Microarray Gene Expression Data

Cited by: 47
Authors
Moorthy, Kohbalan [1]
Mohamad, Mohd Saberi [1]
Deris, Safaai [1]
Affiliations
[1] Univ Technol Malaysia, Fac Comp, Artificial Intelligence & Bioinformat Res Grp, Skudai 81310, Johor, Malaysia
Keywords
Gene expression analysis; gene expression data; information recovery; microarray data; missing value estimation; missing value imputation
DOI
10.2174/1574893608999140109120957
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Many bioinformatics analytical tools, especially those for cancer classification and prediction, require a complete data matrix. Missing values in gene expression studies significantly influence the interpretation of the final data. Because missing values are nevertheless a common problem, missing value imputation algorithms have to be developed or refined to address them. This paper reviews the preferred and available missing value imputation methods for gene expression data. The focus is on each algorithm's ability to exploit local or global data correlation to estimate the missing values. The algorithms are categorized into global, local, hybrid, and knowledge-assisted approaches. The methods presented are accompanied by suitable performance evaluations. The aim of this review is to highlight possible improvements to existing techniques rather than to recommend new algorithms with the same functional aim.
Pages: 18-22
Page count: 5