Sequence imputation from low density single nucleotide polymorphism panel in a black poplar breeding population

Cited by: 4
Authors
Pegard, Marie [1]
Rogier, Odile [1]
Berard, Aurelie [2]
Faivre-Rampant, Patricia [2]
Le Paslier, Marie-Christine [2]
Bastien, Catherine [1]
Jorge, Veronique [1]
Sanchez, Leopoldo [1]
Affiliations
[1] INRA, ONF, BioForA, 2163 Ave Pomme Pin CS 40001 ARDON, F-45075 Orleans 2, France
[2] Univ Paris Saclay, EPGV, INRA, 2 Rue Gaston Cremieux, F-91000 Evry, France
Keywords
Genotype Imputation; Low density arrays; Whole-Genome Resequencing; Populus nigra; WHOLE-GENOME ASSOCIATION; GENOTYPE IMPUTATION; LINKAGE DISEQUILIBRIUM; MISSING GENOTYPES; WIDE ASSOCIATION; ACCURACY; SELECTION; STRATEGIES; PREDICTION; EFFICIENCY;
DOI
10.1186/s12864-019-5660-y
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Genomic selection accuracy increases with high SNP (single nucleotide polymorphism) coverage. However, such gains in coverage come at a high cost, which prevents breeders from implementing them promptly in operational programs. Low-density panels imputed to higher densities offer a cheaper alternative during the first stages of genomic resource development. Our study is the first to explore imputation in a tree species: black poplar. About 1000 pure-breed Populus nigra trees from a breeding population were selected and genotyped with a 12K custom Infinium Bead-Chip. Forty-three of those individuals, corresponding to nodal trees in the pedigree, were fully sequenced (reference), while the remaining majority (target) was imputed from 8K to 1.4 million SNPs using FImpute. Each SNP and individual was evaluated for imputation errors by leave-one-out cross-validation in the training sample of 43 sequenced trees. Summary statistics were calculated, including the Hardy-Weinberg equilibrium exact test p-value, sequencing quality, sequencing depth per site and per individual, minor allele frequency, marker density ratio, and SNP information redundancy. Principal component and Boruta analyses were applied to all these parameters to rank the factors affecting imputation quality. Additionally, we characterized the impact of relatedness between the reference population and the target population.

Results: During the imputation process, we used 7540 SNPs from the chip to impute 1,438,827 SNPs from sequences. At the individual level, imputation accuracy was high, with the proportion of correctly imputed SNPs ranging from 0.84 to 0.99. The variation in accuracy was mostly due to differences in relatedness between individuals. At the SNP level, imputation quality depended on genotyped SNP density and on the original minor allele frequency. Imputation did not appear to increase linkage disequilibrium. Genotype densification gave a better distribution of markers along the genome, and we did not detect any substantial bias in annotation categories.

Conclusions: This study shows that it is possible to impute low-density marker panels to whole-genome sequence with good accuracy under certain conditions that could be common to many breeding populations.
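The scoring step of a leave-one-out evaluation like the one described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it does not call FImpute, and the function names, the 0/1/2 allele-dosage coding, and the toy genotype matrices are all assumptions made for the example. It only shows how a per-individual proportion of correctly imputed SNPs and a per-SNP minor allele frequency could be computed once the true (sequenced) and imputed genotypes are both available.

```python
from typing import List, Optional

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]
Matrix = List[List[Genotype]]  # rows = individuals, columns = SNPs


def concordance_per_individual(true_g: Matrix, imputed_g: Matrix) -> List[float]:
    """Proportion of SNPs imputed correctly for each individual (row).

    SNPs missing in either matrix are skipped for that individual.
    """
    rates = []
    for true_row, imputed_row in zip(true_g, imputed_g):
        pairs = [
            (t, i)
            for t, i in zip(true_row, imputed_row)
            if t is not None and i is not None
        ]
        if pairs:
            rates.append(sum(t == i for t, i in pairs) / len(pairs))
        else:
            rates.append(float("nan"))
    return rates


def minor_allele_frequency(true_g: Matrix) -> List[float]:
    """Minor allele frequency for each SNP (column), from dosage codes."""
    mafs = []
    for j in range(len(true_g[0])):
        calls = [row[j] for row in true_g if row[j] is not None]
        if not calls:
            mafs.append(float("nan"))
            continue
        alt_freq = sum(calls) / (2 * len(calls))  # diploid: 2 alleles per call
        mafs.append(min(alt_freq, 1.0 - alt_freq))
    return mafs


# Toy data (hypothetical): 3 individuals x 4 SNPs.
true_genotypes = [[0, 1, 2, 0], [1, 1, 0, 0], [2, 1, 0, 0]]
imputed_genotypes = [[0, 1, 2, 0], [1, 0, 0, 0], [2, 1, 1, 1]]

per_individual = concordance_per_individual(true_genotypes, imputed_genotypes)
per_snp_maf = minor_allele_frequency(true_genotypes)
print(per_individual)
print(per_snp_maf)
```

In a real leave-one-out run, each sequenced tree would in turn be reduced to its chip SNPs, imputed from the remaining reference trees, and then scored against its own sequence in this way.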
Pages: 16