Enlarging a training set for genomic selection by imputation of un-genotyped animals in populations of varying genetic architecture

Cited by: 27
Authors
Pimentel, Eduardo C. G. [1]
Wensch-Dorendorf, Monika [2]
Koenig, Sven [1]
Swalve, Hermann H. [2]
Affiliations
[1] Univ Kassel, Dept Anim Breeding, D-37213 Witzenhausen, Germany
[2] Univ Halle Wittenberg, Inst Agr & Nutr Sci, D-06099 Halle, Germany
Keywords
NUCLEOTIDE POLYMORPHISM GENOTYPES; LINKAGE DISEQUILIBRIUM; MISSING GENOTYPES; VARIABLE SELECTION; IMPUTE PHASE; ACCURACY; PREDICTION; VALUES; CATTLE; RELIABILITY;
DOI
10.1186/1297-9686-45-12
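Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture];
Discipline classification code
0905;
Abstract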
Background: The most common application of imputation is to infer genotypes of a high-density panel of markers on animals that are genotyped for a low-density panel. However, the increase in accuracy of genomic predictions resulting from an increase in the number of markers tends to reach a plateau beyond a certain density. Another application of imputation is to increase the size of the training set with un-genotyped animals. This strategy can be particularly successful when a set of closely related individuals are genotyped.
Methods: Imputation on completely un-genotyped dams was performed using known genotypes from the sire of each dam, one offspring and the offspring's sire. Two methods were applied based on either allele or haplotype frequencies to infer genotypes at ambiguous loci. Results of these methods and of two available software packages were compared. Quality of imputation under different population structures was assessed. The impact of using imputed dams to enlarge training sets on the accuracy of genomic predictions was evaluated for different populations, heritabilities and sizes of training sets.
Results: Imputation accuracy ranged from 0.52 to 0.93 depending on the population structure and the method used. The method that used allele frequencies performed better than the method based on haplotype frequencies. Accuracy of imputation was higher for populations with higher levels of linkage disequilibrium and with larger proportions of markers with more extreme allele frequencies. Inclusion of imputed dams in the training set increased the accuracy of genomic predictions. Gains in accuracy ranged from close to zero to 37.14%, depending on the simulated scenario. Generally, the larger the accuracy already obtained with the genotyped training set, the lower the increase in accuracy achieved by adding imputed dams.
Conclusions: Whenever a reference population resembling the family configuration considered here is available, imputation can be used to achieve an extra increase in accuracy of genomic predictions by enlarging the training set with completely un-genotyped dams. This strategy was shown to be particularly useful for populations with lower levels of linkage disequilibrium, for genomic selection on traits with low heritability, and for species or breeds for which the size of the reference population is limited.
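The family configuration described in the Methods (dam's sire, one offspring, and the offspring's sire genotyped; dam un-genotyped) can be illustrated with a minimal single-locus sketch. This is not the authors' implementation: it assumes a biallelic marker coded as 0/1/2 copies of allele B, treats loci independently (no linkage or haplotype information), and uses the population allele frequency only for the dam's unknown maternal allele. The function names `transmit_prob` and `impute_dam_locus` are hypothetical.

```python
from itertools import product


def transmit_prob(genotype, allele):
    """P(a parent carrying `genotype` copies of B transmits `allele`; 0 = A, 1 = B)."""
    p_b = genotype / 2.0
    return p_b if allele == 1 else 1.0 - p_b


def impute_dam_locus(dam_sire, offspring, offspring_sire, freq_b):
    """Posterior probabilities of the dam's genotype (0, 1, 2) at one locus.

    Assumption: the dam's maternal allele is drawn from the population
    with frequency `freq_b` for allele B (Hardy-Weinberg style prior).
    """
    posterior = [0.0, 0.0, 0.0]
    # Enumerate the dam's paternal and maternal alleles.
    for pat, mat in product((0, 1), repeat=2):
        prior = transmit_prob(dam_sire, pat) * (freq_b if mat == 1 else 1.0 - freq_b)
        dam = pat + mat
        # Likelihood of the observed offspring genotype given both parents.
        likelihood = 0.0
        for from_dam, from_sire in product((0, 1), repeat=2):
            if from_dam + from_sire == offspring:
                likelihood += (transmit_prob(dam, from_dam)
                               * transmit_prob(offspring_sire, from_sire))
        posterior[dam] += prior * likelihood
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("genotypes are inconsistent with Mendelian inheritance")
    return [p / total for p in posterior]


# Unambiguous locus: the dam inherits B from her sire and passes A to her offspring.
certain = impute_dam_locus(dam_sire=2, offspring=0, offspring_sire=0, freq_b=0.3)
# Ambiguous locus: all three relatives are heterozygous, so the prior decides.
ambiguous = impute_dam_locus(dam_sire=1, offspring=1, offspring_sire=1, freq_b=0.5)
```

In the first call the relatives' genotypes fix the dam as heterozygous regardless of the allele frequency. In the second call nothing is fixed, so the result depends on `freq_b`, which is the kind of ambiguous locus that the frequency-based methods in the abstract are meant to resolve.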
Pages: 12