A Two-Stage Random Forest-Based Pathway Analysis Method

Cited by: 20
Authors
Chung, Ren-Hua [1 ,2 ]
Chen, Ying-Erh [3 ]
Affiliations
[1] Natl Hlth Res Inst, Inst Populat Hlth Sci, Div Biostat & Bioinformat, Zhunan, Miaoli, Taiwan
[2] Univ Miami, Miller Sch Med, John P Hussman Inst Human Gen, Ctr Genet Epidemiol & Stat Genet, Miami, FL 33136 USA
[3] N Carolina State Univ, Dept Econ, Raleigh, NC 27695 USA
Keywords
GENE-GENE; ASSOCIATION; SUSCEPTIBILITY; POLYMORPHISMS; GENOTYPES; RISK; MDM2; SNPS;
DOI
10.1371/journal.pone.0036662
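Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];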
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Pathway analysis provides a powerful approach for identifying the joint effect on disease of genes grouped into biologically based pathways. It is also an attractive approach for secondary analysis of genome-wide association study (GWAS) data, which may still yield new results from these valuable datasets. Most current pathway analysis methods focus on testing the cumulative main effects of genes in a pathway. However, for complex diseases, gene-gene interactions are expected to play a critical role in disease etiology. We extended a random forest-based method for pathway analysis by incorporating a two-stage design. We used simulations to verify that the proposed method has correct type I error rates. We also used simulations to show that, in the presence of gene-gene interactions, the method is more powerful than both the original random forest-based pathway approach and the set-based test implemented in PLINK. Finally, we applied the method to a breast cancer GWAS dataset and a lung cancer GWAS dataset and identified pathways that have implications for breast and lung cancers.
Pages: 6