Improving GWAS discovery and genomic prediction accuracy in biobank data

Cited by: 13
Authors
Orliac, Etienne J. [1]
Banos, Daniel Trejo [2]
Ojavee, Sven E. [3]
Lall, Kristi [4]
Magi, Reedik [4]
Visscher, Peter M. [5]
Robinson, Matthew R. [6]
Affiliations
[1] Univ Lausanne, Sci Comp & Res Support Unit, CH-1015 Lausanne, Switzerland
[2] Univ Zurich, Dept Quantitat Biomed, CH-8057 Zurich, Switzerland
[3] Univ Lausanne, Dept Computat Biol, CH-1015 Lausanne, Switzerland
[4] Univ Tartu, Inst Genom, Estonian Genome Ctr, EE-51010 Tartu, Estonia
[5] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
[6] IST Austria, A-3400 Klosterneuburg, Austria
Funding
Swiss National Science Foundation; Australian Research Council; UK Medical Research Council
Keywords
genomic prediction; association study; Bayesian penalized regression; RESOURCE
DOI
10.1073/pnas.2121279119
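Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]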
Discipline classification code
07; 0710; 09
Abstract
Genetically informed, deep-phenotyped biobanks are an important research resource, and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated h²_SNP. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association study (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62% and 65% increase, respectively. The average χ² value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.
Pages: 8