Prediction of regulatory gene pairs using dynamic time warping and gene ontology

Cited by: 3
Authors
Yang, Andy C. [1]
Hsu, Hui-Huang [1]
Lu, Ming-Da [1]
Tseng, Vincent S. [2]
Shih, Timothy K. [3]
Affiliations
[1] Tamkang Univ, Dept Comp Sci & Informat Engn, New Taipei City, Taiwan
[2] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
[3] Natl Cent Univ, Dept Comp Sci & Informat Engn, Taoyuan, Taiwan
Keywords
microarray time series data; missing value imputation; gene regulation prediction; DTW; dynamic time warping; gene ontology; MISSING VALUE ESTIMATION; MICROARRAY DATA; EXPRESSION; ALGORITHMS; IMPUTATION; NETWORKS;
DOI
10.1504/IJDMB.2014.064010
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
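The abstract names Dynamic Time Warping as the basis for comparing expression profiles. As an illustration only, the following is a minimal sketch of the standard textbook DTW distance between two time series, using absolute difference as the local cost. It is not the authors' DTW-GO measure, which also incorporates Gene Ontology information not described in this record; the function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumption: local cost is the absolute difference |a[i] - b[j]|.
    This is plain DTW, not the DTW-GO measure proposed in the paper.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Two toy expression profiles with the same shape, one delayed by a time step.
profile_a = [0, 1, 2, 1, 0]
profile_b = [0, 0, 1, 2, 1, 0]
shifted_distance = dtw_distance(profile_a, profile_b)
```

Because warping can repeat a time point, the delayed profile aligns with the original at no cost, which is why DTW suits time series whose regulatory response may lag.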
Pages: 121-145
Page count: 25