Selection of important variables by statistical learning in genome-wide association analysis

Authors
Wei (Will) Yang
C Charles Gu
Affiliations
[1] Washington University School of Medicine, Division of Biostatistics
[2] Washington University School of Medicine, Department of Genetics
Keywords
Bayesian Network; Random Forest; Coronary Artery Calcification; Risk SNPs; Random Forest Analysis
DOI
10.1186/1753-6561-3-S7-S70
Abstract
Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, whether among genetic variations (G × G) or with environmental risk factors (G × E). Several statistical learning methods seem quite promising in this context. Herein we consider applications of two such methods, random forest and Bayesian networks, to the simulated dataset for Genetic Analysis Workshop 16 Problem 3. Our evaluation study showed that an iterative search based on the random forest approach has potential for selecting important variables, while Bayesian networks can capture some of the underlying causal relationships.
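The abstract does not spell out the iterative search, so the following is only a minimal sketch of one common way such a search can be set up: backward elimination driven by random forest importance scores, written with scikit-learn. The function name, the 20% drop fraction, the number of trees, and the toy genotype data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def iterative_rf_selection(X, y, n_keep=5, drop_frac=0.2, n_trees=200, seed=0):
    """Backward elimination guided by random forest importance (illustrative)."""
    kept = np.arange(X.shape[1])  # column indices still under consideration
    while len(kept) > n_keep:
        forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        forest.fit(X[:, kept], y)
        # Drop the least important fraction: at least one column per round,
        # and never so many that fewer than n_keep remain.
        n_drop = max(1, int(drop_frac * len(kept)))
        n_drop = min(n_drop, len(kept) - n_keep)
        order = np.argsort(forest.feature_importances_)  # ascending importance
        kept = np.sort(kept[order[n_drop:]])
    return kept.tolist()


# Toy data (hypothetical): 30 SNPs coded 0/1/2 for 400 subjects. Case status
# depends only on a G x G interaction between columns 0 and 1.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(400, 30))
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)

selected = iterative_rf_selection(X, y)
print(selected)
```

Refitting the forest after each round of elimination lets the importance scores be recomputed on a smaller variable set, which is the usual motivation for an iterative search over a single ranking pass.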
Related papers
50 in total
  • [1] Statistical analysis for genome-wide association study
    Ping Zeng
    Yang Zhao
    Cheng Qian
    Liwei Zhang
    Ruyang Zhang
    Jianwei Gou
    Jin Liu
    Liya Liu
    Feng Chen
    The Journal of Biomedical Research, 2015, 29 (04) : 285 - 297
  • [2] Statistical analysis for genome-wide association study
    Zeng, Ping
    Zhao, Yang
    Qian, Cheng
    Zhang, Liwei
    Zhang, Ruyang
    Gou, Jianwei
    Liu, Jin
    Liu, Liya
    Chen, Feng
    JOURNAL OF BIOMEDICAL RESEARCH, 2015, 29 (04) : 285 - 297
  • [3] Statistical selection of biological models for genome-wide association analyses
    Bi, Wenjian
    Kang, Guolian
    Pounds, Stanley B.
    METHODS, 2018, 145 : 67 - 75
  • [4] Statistical Selection of Biological Models for Genome-Wide Association Analyses
    Bi, Wenjian
    Kang, Guolian
    Pounds, Stanley B.
    2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2017, : 150 - 157
  • [5] Statistical Power of Model Selection Strategies for Genome-Wide Association Studies
    Wu, Zheyang
    Zhao, Hongyu
    PLOS GENETICS, 2009, 5 (07):
  • [6] Statistical methods adopted in genome-wide association study and genomic selection
    Hayashi, Takeshi
    GENES & GENETIC SYSTEMS, 2011, 86 (06) : 393 - 393
  • [7] Stability Selection for Genome-Wide Association
    Alexander, David H.
    Lange, Kenneth
    GENETIC EPIDEMIOLOGY, 2011, 35 (07) : 722 - 728
  • [8] Statistical methods for genome-wide association studies
    Wang, Maggie Haitian
    Cordell, Heather J.
    Van Steen, Kristel
    SEMINARS IN CANCER BIOLOGY, 2019, 55 : 53 - 60
  • [9] Statistical Methods in Genome-Wide Association Studies
    Sun, Ning
    Zhao, Hongyu
    ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE, 2020, 3 : 265 - 288
  • [10] Statistical approaches for genome-wide association studies
    Balding, D.
    EJC SUPPLEMENTS, 2008, 6 (09) : 187 - 187