Selection of important variables by statistical learning in genome-wide association analysis

Authors
Wei (Will) Yang
C Charles Gu
Affiliations
[1] Washington University School of Medicine, Division of Biostatistics
[2] Washington University School of Medicine, Department of Genetics
Keywords
Bayesian Network; Random Forest; Coronary Artery Calcification; Risk SNPs; Random Forest Analysis
DOI
10.1186/1753-6561-3-S7-S70
Abstract
Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, whether among genetic variations (G × G) or with environmental risk factors (G × E). Several statistical learning methods seem quite promising in this context. Herein we consider applications of two such methods, random forest and Bayesian networks, to the simulated dataset for Genetic Analysis Workshop 16 Problem 3. Our evaluation study showed that an iterative search based on the random forest approach has potential for selecting important variables, while Bayesian networks can capture some of the underlying causal relationships.