Integrated genomic analysis of biological gene sets with applications in lung cancer prognosis

Cited by: 8
Authors
Chu, Su Hee [1, 4]
Huang, Yen-Tsung [1, 2, 3]
Affiliations
[1] Brown Univ, Sch Publ Hlth, Dept Epidemiol, 121 S Main St, Providence, RI 02912 USA
[2] Brown Univ, Sch Publ Hlth, Dept Biostat, 121 S Main St, Providence, RI 02912 USA
[3] Acad Sinica, Inst Stat Sci, 128,Sect 2,Acad Rd, Taipei, Taiwan
[4] Harvard Med Sch, Brigham & Womens Hosp, Channing Div Network Med, 181 Longwood Ave, Boston, MA USA
Keywords
Pathway analysis; Data integration; Epigenetics; Gene expression; Gene set analysis; Integrative genomics; EXPRESSION DATA; WIDE ASSOCIATION; MIXED MODELS; DISCOVERY; KNOWLEDGE; FRAMEWORK; INFERENCE; PATHWAY;
DOI
10.1186/s12859-017-1737-2
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Burgeoning interest in integrative analyses has produced a rise in studies that incorporate data from multiple genomic platforms. Literature on conducting formal hypothesis testing at the integrative gene set level is considerably sparse. This paper is biologically motivated by our interest in the joint effects of epigenetic methylation loci and their associated mRNA gene expressions on lung cancer survival status. Results: We provide an efficient screening approach across multiplatform genomic data at the level of biologically related sets of genes, and our methods are applicable to various disease models regardless of whether the underlying true model is known (iTEGS) or unknown (iNOTE). Our proposed testing procedure dominated two competing methods. Using our methods, we identified a total of 28 gene sets with significant joint epigenomic and transcriptomic effects on one-year lung cancer survival. Conclusions: We propose efficient variance component-based testing procedures to facilitate the joint testing of multiplatform genomic data across an entire gene set. The testing procedure for the gene set is self-contained and can easily be extended to include more or different genetic platforms. iTEGS and iNOTE, implemented in R, are freely available through the inote package at https://cran.r-project.org//.
Pages: 13