Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits

Cited by: 223
Authors
MacLeod, I. M. [1, 2, 3]
Bowman, P. J. [2, 3, 4]
Vander Jagt, C. J. [2, 3]
Haile-Mariam, M. [2, 3]
Kemper, K. E. [1, 3]
Chamberlain, A. J. [2, 3]
Schrooten, C. [5]
Hayes, B. J. [2, 3, 4]
Goddard, M. E. [1, 2, 3]
Affiliations
[1] Univ Melbourne, Fac Vet & Agr Sci, Melbourne, Vic 3010, Australia
[2] AgriBio, Dairy Futures Cooperat Res Ctr, Bundoora, Vic, Australia
[3] AgriBio, Dept Econ Dev Jobs Transport & Resources, Bundoora, Vic, Australia
[4] La Trobe Univ, Biosci Res Ctr, Bundoora, Vic 3086, Australia
[5] CRV, NL-6800 AL Arnhem, Netherlands
Keywords
Bayesian analysis; Biological model; Genomic selection; Whole-genome association analysis; Milk traits; Dairy cattle; MILK PROTEIN POLYMORPHISMS; BOVINE BETA-LACTOGLOBULIN; FATTY-ACID-COMPOSITION; DAIRY-CATTLE; WIDE ASSOCIATION; MAMMARY-GLAND; HOLSTEIN CATTLE; HUMAN HEIGHT; RECEPTOR; GENE;
DOI
10.1186/s12864-016-2443-6
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Dense SNP genotypes are often combined with complex trait phenotypes to map causal variants, study genetic architecture and provide genomic predictions for individuals with genotypes but no phenotype. A single method of analysis that jointly fits all genotypes in a Bayesian mixture model (BayesR) has been shown to competitively address all three purposes simultaneously. However, BayesR and other similar methods ignore prior biological knowledge and assume all genotypes are equally likely to affect the trait. While this assumption is reasonable for SNP array genotypes, it is less sensible if genotypes are whole-genome sequence variants, which should include causal variants.
Results: We introduce a new method (BayesRC) based on BayesR that incorporates prior biological information in the analysis by defining classes of variants likely to be enriched for causal mutations. The information can be derived from a range of sources, including variant annotation, candidate gene lists and known causal variants. This information is then incorporated objectively in the analysis based on evidence of enrichment in the data. We demonstrate the increased power of BayesRC compared to BayesR using real dairy cattle genotypes with simulated phenotypes. The genotypes were imputed whole-genome sequence variants in coding regions combined with dense SNP markers. BayesRC increased the power to detect causal variants and increased the accuracy of genomic prediction. The relative improvement for genomic prediction was most apparent in validation populations that were not closely related to the reference population. We also applied BayesRC to real milk production phenotypes in dairy cattle using independent biological priors from gene expression analyses. Although current biological knowledge of which genes and variants affect milk production is still very incomplete, our results suggest that the new BayesRC method was equal to or more powerful than BayesR for detecting candidate causal variants and for genomic prediction of milk traits.
Conclusions: BayesRC provides a novel and flexible approach to simultaneously improving the accuracy of QTL discovery and genomic prediction by taking advantage of prior biological knowledge. Approaches such as BayesRC will become increasingly useful as biological knowledge accumulates regarding functional regions of the genome for a range of traits and species.
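The abstract's idea of letting the data decide how strongly each variant class is enriched can be illustrated with a small sketch. It is not the authors' implementation. It assumes (none of this is stated in the abstract) that every variant is currently assigned to one of four mixture components, with component 0 meaning "no effect", and that each class draws its own mixture proportions from a Dirichlet distribution with a uniform prior. The function name `sample_class_proportions` and the toy data are hypothetical.

```python
import random

def sample_class_proportions(component_of, class_of, n_classes,
                             n_components=4, prior=1.0, rng=None):
    """Draw one set of mixture proportions per variant class.

    component_of[i]: mixture component currently assigned to variant i
                     (0 = zero effect; assumption for this sketch).
    class_of[i]:     biological class of variant i (e.g. 0 = other, 1 = candidate gene).
    Each class gets proportions ~ Dirichlet(prior + counts within that class).
    """
    rng = rng or random.Random()
    # Count, within each class, how many variants sit in each component.
    counts = [[0] * n_components for _ in range(n_classes)]
    for comp, cls in zip(component_of, class_of):
        counts[cls][comp] += 1
    proportions = []
    for cls_counts in counts:
        # A Dirichlet draw is a vector of independent Gamma draws, normalised to sum to 1.
        gammas = [rng.gammavariate(prior + c, 1.0) for c in cls_counts]
        total = sum(gammas)
        proportions.append([g / total for g in gammas])
    return proportions

# Toy data (hypothetical): class 1 has relatively more variants in non-zero components.
class_of = [0] * 90 + [1] * 10
component_of = [0] * 88 + [1, 2] + [0] * 6 + [1, 2, 3, 3]
props = sample_class_proportions(component_of, class_of, n_classes=2,
                                 rng=random.Random(1))
for cls, p in enumerate(props):
    print(cls, [round(x, 3) for x in p])
```

Because the proportions are redrawn from the counts within each class, a class that contains no more non-zero-effect variants than the rest of the genome receives no special weight, which is one way to read "incorporated objectively ... based on evidence of enrichment in the data".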
Pages: 21