An Integrated Statistical Approach to Compare Transcriptomics Data Across Experiments: A Case Study on the Identification of Candidate Target Genes of the Transcription Factor PPAR alpha

Times cited: 0
Authors
Ullah, Mohammad Ohid [1]
Muller, Michael [1,2]
Hooiveld, Guido J. E. J. [1,2]
Affiliations
[1] Wageningen Univ, Div Human Nutr, Nutr Metab & Genom Grp, Wageningen, Netherlands
[2] Netherlands Nutrigenom Ctr, TI Food & Nutr, Wageningen, Netherlands
Keywords
false discovery rate; gene expression; transcriptomics; microarray; transcription factor; peroxisome proliferator-activated receptor alpha; ANOVA
DOI
10.4137/BBI.S9529
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
An effective strategy to elucidate the signal transduction cascades activated by a transcription factor is to compare the transcriptional profiles of wild-type and transcription factor knockout models. Many statistical tests have been proposed for analyzing gene expression data, but most are based on pair-wise comparisons. Since the analysis of microarrays involves testing multiple hypotheses within one study, it is generally accepted that one should control for false positives using the false discovery rate (FDR). However, it has been reported that the FDR may be an inappropriate metric for comparing data across different experiments. Here we propose an approach that addresses this problem by simultaneously testing and integrating three hypotheses (contrasts) using the cell means ANOVA model. These three contrasts test for the effect of a treatment in wild type, in gene knockout, and globally over all experimental groups. We illustrate our approach on microarray experiments that focused on the identification of candidate target genes and biological processes governed by the fatty acid sensing transcription factor PPAR alpha in liver. Compared to the often applied FDR-based across-experiment comparison, our approach identified a conservative but less noisy set of candidate genes with the same sensitivity and specificity. Moreover, our method had the advantage of properly adjusting for multiple testing while integrating data from two experiments, and was driven by biological inference. Taken together, in this study we present a simple yet efficient strategy to compare differential expression of genes across experiments while controlling for multiple hypothesis testing.
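As a rough illustration of the ideas named in the abstract, the sketch below fits a cell means model (one mean per experimental group) for a single gene, computes t-statistics for the two within-genotype treatment contrasts, and applies a Benjamini-Hochberg FDR adjustment to a list of p-values. It is not the authors' implementation: the group names (`wt_ctrl`, `wt_trt`, `ko_ctrl`, `ko_trt`), the helper functions, and the example values are all hypothetical, and the global contrast and the rule for integrating the three tests are left out because the abstract does not specify their form.

```python
from statistics import mean


def cell_means_fit(groups):
    """Fit y_ij = mu_i + e_ij for one gene.

    groups maps a (hypothetical) group name to its expression values.
    Returns the cell means, the pooled residual variance, and the residual df.
    """
    means = {g: mean(v) for g, v in groups.items()}
    n_total = sum(len(v) for v in groups.values())
    df = n_total - len(groups)
    sse = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    return means, sse / df, df


def contrast_t(groups, weights):
    """Estimate and t-statistic for the contrast sum_g weights[g] * mu_g."""
    means, s2, df = cell_means_fit(groups)
    est = sum(w * means[g] for g, w in weights.items())
    se = (s2 * sum(w ** 2 / len(groups[g]) for g, w in weights.items())) ** 0.5
    return est, est / se, df


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up expression values for one gene in a 2x2 design (assumption).
gene = {
    "wt_ctrl": [1.0, 2.0, 3.0],
    "wt_trt": [4.0, 5.0, 6.0],
    "ko_ctrl": [1.0, 2.0, 3.0],
    "ko_trt": [1.0, 2.0, 3.0],
}

# Treatment effect within wild type and within knockout.
wt_effect = contrast_t(gene, {"wt_trt": 1.0, "wt_ctrl": -1.0})
ko_effect = contrast_t(gene, {"ko_trt": 1.0, "ko_ctrl": -1.0})
```

A gene that responds to treatment in wild type but not in the knockout, as in this toy example, is the kind of pattern one would look for in a candidate target gene. In practice the contrast p-values would come from the t-distribution with the residual degrees of freedom, and `bh_adjust` would be run over all genes.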
Pages: 145-154
Page count: 10