Evaluating the use of statistical and machine learning methods for estimating breed composition of purebred and crossbred animals in thirteen cattle breeds using genomic information

Cited by: 3
Authors
Ryan, C. A. [1, 2]
Berry, D. P. [1]
O'Brien, A. [1]
Pabiou, T. [3]
Purfield, D. C. [2]
Affiliations
[1] Teagasc, Midleton, Co Cork, Ireland
[2] Munster Technol Univ, Cork, Ireland
[3] Irish Cattle Breeding Federat, Cork, Ireland
Keywords
genomic breed composition; cattle; crossbred; population assignment; low-density panels; best linear unbiased prediction; Admixture; genetic diversity; PARTIAL LEAST-SQUARES; POPULATION-STRUCTURE; DISCRIMINANT-ANALYSIS; ASSIGNMENT; DAIRY; IDENTIFICATION; ANCESTRY; TRAITS; SNPS;
DOI
10.3389/fgene.2023.1120312
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Introduction: The ability to accurately predict breed composition using genomic information has many potential uses, including increasing the accuracy of genetic evaluations, optimising mating plans, and serving as a parameter for genotype quality control. The objective of the present study was to use a database of genotyped purebred and crossbred cattle to compare breed composition predictions from the freely available software Admixture with those from a single nucleotide polymorphism Best Linear Unbiased Prediction (SNP-BLUP) approach; a supplementary objective was to determine the accuracy and general robustness of low-density genotype panels for predicting breed composition.

Methods: All animals had genotype information on 49,213 autosomal single nucleotide polymorphisms (SNPs). Thirteen breeds were included in the analysis, and 500 purebred animals per breed were used to establish the breed training populations. Accuracy of breed composition prediction was determined using a separate validation population of 3,146 verified purebred and 4,330 two- and three-way crossbred cattle.

Results: When all 49,213 autosomal SNPs were used for breed prediction, a minimal absolute mean difference of 0.04 between Admixture and SNP-BLUP breed predictions was evident. For crossbreds, the average absolute difference in breed prediction estimates generated using SNP-BLUP and Admixture was 0.068, with a root mean square error of 0.08. Breed predictions from low-density SNP panels were generated using both SNP-BLUP and Admixture and compared to breed prediction estimates using all 49,213 SNPs (representing the gold standard). Estimating the breed composition of crossbreds required more SNPs than predicting the breed composition of purebreds. SNP-BLUP required ≥3,000 SNPs to predict crossbred breed composition, but only 2,000 SNPs were required to predict purebred breed status. The absolute mean (standard deviation) difference across all panels <2,000 SNPs was 0.091 (0.054) and 0.315 (0.316) when predicting the breed composition of all animals using Admixture and SNP-BLUP, respectively, compared to the gold standard prediction.

Discussion: Nevertheless, a negligible absolute mean (standard deviation) difference of 0.009 (0.123) in breed prediction existed between SNP-BLUP and Admixture once ≥3,000 SNPs were considered, indicating that the prediction of breed composition could be readily integrated into SNP-BLUP pipelines used for genomic evaluations, thereby avoiding the necessity for stand-alone software.
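
The agreement statistics quoted in the abstract (absolute mean difference and root mean square error between two sets of breed composition estimates) can be illustrated with a minimal sketch. The abstract does not state how the authors pooled differences across breeds and animals, so pooling over every animal-by-breed cell is an assumption here, and the function name and example proportions are hypothetical rather than taken from the study.

```python
import math


def compare_breed_predictions(pred_a, pred_b):
    """Return (mean absolute difference, RMSE) between two sets of estimates.

    pred_a, pred_b: lists with one row per animal; each row holds breed
    proportions in the same breed order. Pooling over all animal-by-breed
    cells is an assumption, not the paper's documented procedure.
    """
    diffs = [
        a - b
        for row_a, row_b in zip(pred_a, pred_b)
        for a, b in zip(row_a, row_b)
    ]
    if not diffs:
        raise ValueError("no predictions to compare")
    mean_abs_diff = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean_abs_diff, rmse


# Hypothetical proportions for two animals and three breeds.
admixture_est = [[0.50, 0.50, 0.00], [1.00, 0.00, 0.00]]
snp_blup_est = [[0.45, 0.50, 0.05], [0.90, 0.10, 0.00]]

mean_abs_diff, rmse = compare_breed_predictions(admixture_est, snp_blup_est)
print(f"mean absolute difference = {mean_abs_diff:.3f}, RMSE = {rmse:.3f}")
```

Because RMSE squares each difference, it is never smaller than the mean absolute difference and grows faster when a few animals disagree strongly between methods.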
Pages: 13
Related papers
58 in total
[1]   Fast model-based estimation of ancestry in unrelated individuals [J].
Alexander, David H. ;
Novembre, John ;
Lange, Kenneth .
GENOME RESEARCH, 2009, 19 (09) :1655-1664
[2]   Partial least squares for discrimination [J].
Barker, M ;
Rayens, W .
JOURNAL OF CHEMOMETRICS, 2003, 17 (03) :166-173
[3]   Combined use of principal component analysis and random forests identify population-informative single nucleotide polymorphisms: application in cattle breeds [J].
Bertolini, F. ;
Galimberti, G. ;
Calo, D. G. ;
Schiavo, G. ;
Matassino, D. ;
Fontanesi, L. .
JOURNAL OF ANIMAL BREEDING AND GENETICS, 2015, 132 (05) :346-356
[4]   Evaluation of factors affecting individual assignment precision using microsatellite data from horse breeds and simulated breed crosses [J].
Bjornstad, G ;
Roed, KH .
ANIMAL GENETICS, 2002, 33 (04) :264-270
[5]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[7]   Partial least squares discriminant analysis: taking the magic away [J].
Brereton, Richard G. ;
Lloyd, Gavin R. .
JOURNAL OF CHEMOMETRICS, 2014, 28 (04) :213-225
[8]   Ancestry informative markers derived from discriminant analysis of principal components provide important insights into the composition of crossbred cattle [J].
Chhotaray, Supriya ;
Panigrahi, Manjit ;
Pal, Dhan ;
Ahmad, Sheikh Firdous ;
Bhushan, Bharat ;
Gaur, G. K. ;
Mishra, B. P. ;
Singh, R. K. .
GENOMICS, 2020, 112 (02) :1726-1733
[9]   Lamb meat quality assessment by support vector machines [J].
Cortez, Paulo ;
Portelinha, Manuel ;
Rodrigues, Sandra ;
Cadavez, Vasco ;
Teixeira, Alfredo .
NEURAL PROCESSING LETTERS, 2006, 24 (01) :41-51
[10]   Breed assignment test in four Italian beef cattle breeds [J].
Dalvit, C. ;
De Marchi, M. ;
Dal Zotto, R. ;
Gervaso, M. ;
Meuwissen, T. ;
Cassandro, M. .
MEAT SCIENCE, 2008, 80 (02) :389-395