Autoregressive-Model-Based Missing Value Estimation for DNA Microarray Time Series Data

Cited by: 54
Authors
Choong, Miew Keen [1]
Charbit, Maurice [2]
Yan, Hong [1,3]
Affiliations
[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
[2] Ecole Natl Super Telecommun Bretagne, Dept Signal & Image Proc, F-75634 Paris, France
[3] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China
Source
IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE | 2009, Vol. 13, No. 1
Keywords
Autoregressive (AR) model; microarray data analysis; missing value estimation; time series analysis; CELL-CYCLE; EXPRESSION; IMPUTATION; GENES;
DOI
10.1109/TITB.2008.2007421
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Missing value estimation is important in DNA microarray data analysis. A number of algorithms have been developed to solve this problem, but they have several limitations. Most existing algorithms cannot handle the case where a particular time point (column) of the data is missing entirely. In this paper, we present an autoregressive-model-based missing value estimation method (ARLSimpute) that takes into account the dynamic property of microarray temporal data and the local similarity structures in the data. ARLSimpute is especially effective when a particular time point contains many missing values or when the entire time point is missing. Experimental results suggest that our proposed algorithm is an accurate missing value estimator in comparison with other imputation methods on simulated as well as real microarray time series datasets.
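To make the autoregressive idea in the abstract concrete, the following is a minimal sketch, not the ARLSimpute algorithm itself. It assumes a single gene profile whose last time point is missing, fits AR coefficients to the observed values by ordinary least squares, and predicts the missing value from the most recent observations. The function names `fit_ar` and `predict_next` and the model order are illustrative assumptions; the local similarity structure that the abstract mentions is not modeled here.

```python
import numpy as np

def fit_ar(series, order):
    """Least-squares fit of AR coefficients a_1..a_p such that
    x[t] is approximated by sum_k a_k * x[t-k] (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    # Each row holds the `order` values preceding x[t], most recent first.
    X = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_next(series, coeffs):
    """Predict the value following `series` from the fitted coefficients."""
    x = np.asarray(series, dtype=float)
    order = len(coeffs)
    return float(np.dot(coeffs, x[-order:][::-1]))

# Toy profile generated by x[t] = x[t-1] - 0.5 * x[t-2]; the next
# time point is treated as the missing entry to estimate.
observed = [1.0, 1.0, 0.5, 0.0, -0.25, -0.25, -0.125]
coeffs = fit_ar(observed, order=2)
estimate = predict_next(observed, coeffs)
```

On real expression profiles the series are short and noisy, so the model order would need to be chosen with care, and a column that is missing for every gene cannot be recovered from any single profile's neighbors in the same column, which is the situation the abstract singles out.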
Pages: 131-137
Number of pages: 7