An efficient unified model for genome-wide association studies and genomic selection

Cited by: 26
Authors
Li, Hengde [1, 2]
Su, Guosheng [3]
Jiang, Li [1, 2]
Bao, Zhenmin [4]
Affiliations
[1] Chinese Acad Fishery Sci, Minist Agr, Key Lab Aquat Genom, CAFS Key Lab Aquat Genom, Beijing 100141, Peoples R China
[2] Chinese Acad Fishery Sci, Ctr Appl Aquat Genom, Beijing Key Lab Fishery Biotechnol, Beijing 100141, Peoples R China
[3] Aarhus Univ, Dept Mol Biol & Genet, Ctr Quantitat Genet & Genom, DK-8830 Tjele, Denmark
[4] Ocean Univ China, Coll Marine Life, Qingdao 266003, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
QUANTITATIVE TRAIT LOCI; MIXED-MODEL; POPULATION-STRUCTURE; PREDICTION; LASSO; INFORMATION; IMPROVEMENT;
DOI
10.1186/s12711-017-0338-x
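Chinese Library Classification
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture];
Discipline classification code
0905;
Abstract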
Background: A quantitative trait is controlled both by major variants with large genetic effects and by minor variants with small effects. Genome-wide association studies (GWAS) are an efficient approach to identify quantitative trait loci (QTL), and genomic selection (GS) with high-density single nucleotide polymorphisms (SNPs) can achieve higher accuracy of estimated breeding values than conventional best linear unbiased prediction (BLUP). GWAS and GS address different aspects of quantitative traits, but, as statistical models, they are quite similar in their description of the genetic mechanisms that underlie quantitative traits.

Methods: Here, we propose a stepwise linear regression mixed model (StepLMM) to unify GWAS and GS in a single statistical model. First, the variance components of the genomic BLUP (GBLUP) model are estimated. Then, in the SNP selection step, the linear mixed model (LMM) for GWAS is equivalently transformed into a simple linear regression to improve computation speed, and the most significant SNP is selected and included in the evaluation model. In the SNP dropping step, the SNPs in the evaluation model are tested according to the standard errors of their estimated effects. If non-significant SNPs are present, the least significant one is dropped from the model and the variance components are re-estimated. We used the extended Bayesian information criterion (eBIC) to choose among the resulting models: the model with the smallest eBIC is retained as the final one and includes only significant SNPs.

Results: We simulated scenarios with 100 QTL and different heritabilities. StepLMM estimated heritability accurately and mapped QTL precisely. Genomic prediction accuracy was much higher with StepLMM than with GBLUP. The comparison of StepLMM with other GWAS and GS methods based on a dataset from the 16th QTLMAS Workshop showed that StepLMM had medium mapping power, the lowest rate of false positives for QTL mapping, and the highest accuracy for genomic prediction.

Conclusions: StepLMM is a combination of GWAS and GBLUP. The two benefit each other within a single statistical model: GWAS improves genomic prediction accuracy, while GBLUP increases mapping precision and decreases the false-positive rate of GWAS. StepLMM performs well in both GWAS and GS and is feasible for agricultural breeding programs and human genetic studies.
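The eBIC-based model choice described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' StepLMM implementation: it assumes ordinary least squares with no polygenic (GBLUP) random effect, omits the SNP dropping step and the re-estimation of variance components, and uses eBIC itself as the criterion for adding a SNP. The eBIC form is the standard one from Chen and Chen (2008), eBIC = -2 log L + k log n + 2γ log C(p, k). The function names (`ebic`, `forward_select`), the `gamma` and `max_snps` parameters, and the simulated data are all hypothetical.

```python
import math
import numpy as np

def log_binom(p, k):
    """Natural log of the binomial coefficient C(p, k)."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)

def ebic(log_lik, n, k, p, gamma=1.0):
    """Extended BIC: -2 logL + k log n + 2 gamma log C(p, k)."""
    return -2.0 * log_lik + k * math.log(n) + 2.0 * gamma * log_binom(p, k)

def gaussian_log_lik(y, X):
    """Maximised Gaussian log-likelihood of an OLS fit of y on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)

def forward_select(y, G, max_snps=10, gamma=1.0):
    """Add SNPs one at a time while eBIC keeps decreasing (simplified:
    fixed effects only, no dropping step, no random polygenic term)."""
    n, p = G.shape
    intercept = np.ones((n, 1))
    selected = []
    best = ebic(gaussian_log_lik(y, intercept), n, 0, p, gamma)
    while len(selected) < max_snps:
        candidates = [j for j in range(p) if j not in selected]
        scores = []
        for j in candidates:
            X = np.hstack([intercept, G[:, selected + [j]]])
            k = len(selected) + 1
            scores.append(ebic(gaussian_log_lik(y, X), n, k, p, gamma))
        i = int(np.argmin(scores))
        if scores[i] >= best:  # no candidate lowers eBIC: stop
            break
        best = scores[i]
        selected.append(candidates[i])
    return selected, best

# Hypothetical toy data: 200 individuals, 50 SNPs coded 0/1/2,
# with two simulated causal SNPs at columns 3 and 17.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.5, size=(200, 50)).astype(float)
y = 2.0 * G[:, 3] - 2.0 * G[:, 17] + rng.normal(size=200)
selected, best_ebic = forward_select(y, G)
print("selected SNP columns:", selected)
```

With `gamma=0` the criterion reduces to the ordinary BIC; larger `gamma` penalises model size more heavily when the number of candidate SNPs `p` is large, which is the situation the abstract describes.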
Pages: 8