Selection of important variables by statistical learning in genome-wide association analysis

Authors
Wei (Will) Yang
C Charles Gu
Affiliations
[1] Washington University School of Medicine,Division of Biostatistics
[2] Washington University School of Medicine,Department of Genetics
Keywords
Bayesian Network; Random Forest; Coronary Artery Calcification; Risk SNPs; Random Forest Analysis
DOI
10.1186/1753-6561-3-S7-S70
Abstract
Genetic analysis of complex diseases demands novel analytical methods to interpret data collected on thousands of variables by genome-wide association studies. The complexity of such analysis is multiplied when one has to consider interaction effects, whether among genetic variants (G × G) or with environmental risk factors (G × E). Several statistical learning methods seem quite promising in this context. Herein we consider applications of two such methods, random forest and Bayesian networks, to the simulated dataset for Genetic Analysis Workshop 16 Problem 3. Our evaluation study showed that an iterative search based on the random forest approach has potential for selecting important variables, while Bayesian networks can capture some of the underlying causal relationships.
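The abstract does not spell out how the iterative random forest search works, so the following is only a minimal sketch of one common reading: fit a forest, rank variables by importance, discard the weakest fraction, and repeat. It assumes NumPy and scikit-learn are available. The function name `iterative_rf_selection`, its parameters, the simulated genotypes, and the choice of causal SNPs are all hypothetical illustrations, not the authors' procedure or the Genetic Analysis Workshop 16 data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def iterative_rf_selection(X, y, n_keep=10, drop_frac=0.5, n_trees=200, seed=0):
    """Hypothetical helper: repeatedly fit a random forest and drop the
    least important variables until only n_keep remain.

    Returns the column indices of the surviving variables, ordered by
    their importance in the last forest that was fitted.
    """
    kept = np.arange(X.shape[1])
    while kept.size > n_keep:
        forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        forest.fit(X[:, kept], y)
        # Rank the surviving variables from most to least important.
        order = np.argsort(forest.feature_importances_)[::-1]
        # Keep the top (1 - drop_frac) share, but never fewer than n_keep.
        n_next = max(n_keep, int(kept.size * (1.0 - drop_frac)))
        kept = kept[order[:n_next]]
    return kept


# Simulated data (an assumption for illustration only): 400 subjects,
# 100 SNPs coded as 0/1/2 minor-allele counts.
rng = np.random.default_rng(0)
n_subjects, n_snps = 400, 100
X = rng.integers(0, 3, size=(n_subjects, n_snps))

# Hypothetical causal model: SNP 3 has a main effect and interacts with SNP 7.
logit = -1.5 + 0.8 * X[:, 3] + 0.8 * X[:, 3] * X[:, 7]
y = (rng.random(n_subjects) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

selected = iterative_rf_selection(X, y, n_keep=5)
print("Selected SNP indices:", selected.tolist())
```

Halving the candidate set on each pass keeps the number of forest fits small (logarithmic in the number of variables); a gentler `drop_frac` trades more computation for a more cautious elimination path.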