Combined use of principal component analysis and random forests identify population-informative single nucleotide polymorphisms: application in cattle breeds
Allocation;
bovine genome;
informative SNPs;
population assignment;
SNP panel;
FEED-INTAKE;
TRACEABILITY;
SNP;
AUTHENTICATION;
ASSOCIATION;
ASSIGNMENT;
LOCUS;
GENE;
HOLSTEIN;
MARKERS;
DOI:
10.1111/jbg.12155
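Chinese Library Classification number:
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture];
Discipline classification code:
0905;
Abstract: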
The genetic identification of the population of origin of individuals, including animals, has several practical applications in forensics, evolution, conservation genetics, breeding and authentication of animal products. Commercial high-density single nucleotide polymorphism (SNP) genotyping tools that have recently been developed in many species provide information from a large number of polymorphic sites that can be used to identify population-/breed-informative markers. In this study, starting from Illumina BovineSNP50 v1 BeadChip array genotyping data available from 3711 cattle of four breeds (2091 Italian Holstein, 738 Italian Brown, 475 Italian Simmental and 407 Marchigiana), principal component analysis (PCA) and random forests (RFs) were combined to identify informative SNP panels useful for cattle breed identification. From a PCA-preselected list of 580 SNPs, RFs were computed using two ranking methods (Mean Decrease in the Gini Index and Mean Accuracy Decrease) to identify the most informative 48 and 96 SNPs for breed assignment. The out-of-bag (OOB) error rate for both ranking methods and SNP densities ranged from 0.0 to 0.1% in the reference population. Application of this approach in a test population (10% of individuals pre-extracted from the whole data set) achieved 100% correct assignment with both classifiers. Linkage disequilibrium between selected SNPs was relevant (r² > 0.6) in only a few pairs of markers, indicating that most of the selected SNPs captured different fractions of variance. Several informative SNPs were in genes/QTL regions that affect or are associated with phenotypes or production traits that might differentiate the investigated breeds. The combination of PCA and RF to perform SNP selection and breed assignment can be easily implemented and is able to identify subsets of informative SNPs useful for population assignment starting from a large number of markers derived from high-throughput genotyping platforms.
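As a rough illustration of two quantities mentioned in the abstract, the sketch below ranks SNPs by the Gini impurity decrease of a single genotype split and flags marker pairs with r² > 0.6. It is not the authors' pipeline: a one-split decrease stands in for the forest-averaged Mean Decrease in the Gini Index, r² is taken as the squared Pearson correlation of genotypes coded 0/1/2 (the abstract does not say how r² was computed), and the breed labels, SNP names and genotype values are invented toy data.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(genotypes, breeds):
    """Largest impurity drop from one split 'genotype <= t', t in {0, 1}."""
    n = len(breeds)
    parent = gini(breeds)
    best = 0.0
    for t in (0, 1):
        left = [b for g, b in zip(genotypes, breeds) if g <= t]
        right = [b for g, b in zip(genotypes, breeds) if g > t]
        if not left or not right:
            continue  # split does not separate any animals
        child = len(left) / n * gini(left) + len(right) / n * gini(right)
        best = max(best, parent - child)
    return best


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treat as 0
    return sxy ** 2 / (sxx * syy)


# Toy data (invented): 8 animals, 2 hypothetical breeds, 3 hypothetical SNPs.
breeds = ["A", "A", "A", "A", "B", "B", "B", "B"]
snps = {
    "snp1": [0, 0, 0, 1, 2, 2, 2, 2],
    "snp2": [0, 0, 1, 0, 2, 2, 2, 1],
    "snp3": [0, 1, 2, 1, 0, 1, 2, 1],
}

# Most breed-informative SNPs first.
ranking = sorted(snps, key=lambda s: gini_decrease(snps[s], breeds), reverse=True)

# Marker pairs that may carry largely redundant information.
high_ld = [(a, b) for a, b in combinations(snps, 2)
           if r_squared(snps[a], snps[b]) > 0.6]
```

In a full random forest the impurity decrease would be accumulated over many trees grown on bootstrap samples, and the animals left out of each bootstrap sample would supply the out-of-bag error estimate.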
Affiliations:
Univ Bologna, DIPROVAL, Sez Allevamenti Zootecn, I-42123 Reggio Emilia, Italy
Univ Palermo, Dipartimento S En Fi Mi Zo, Sez Prod Anim, I-90128 Palermo, Italy
ConSDABI, I-82100 Contrada Piano Cappelle, Benevento, Italy
Authors:
Beretti, Francesca;
Dall'Olio, Stefania;
Portolano, Baldassare;
Matassino, Donato;
Russo, Vincenzo