Automated tetraploid genotype calling by hierarchical clustering

Cited by: 48
Authors
Carley, Cari A. Schmitz [1]
Coombs, Joseph J. [2]
Douches, David S. [2]
Bethke, Paul C. [1, 3]
Palta, Jiwan P. [1]
Novy, Richard G. [4]
Endelman, Jeffrey B. [1]
Affiliations
[1] Univ Wisconsin, Dept Hort, Madison, WI 53706 USA
[2] Michigan State Univ, Dept Plant Soil & Microbial Sci, E Lansing, MI 48824 USA
[3] USDA ARS, Madison, WI 53706 USA
[4] USDA ARS, Small Grains & Potato Germplasm Res Unit, Aberdeen, ID 83210 USA
Funding
National Institute of Food and Agriculture (USA);
Keywords
POTATO SOLANUM-TUBEROSUM; GENETIC-LINKAGE MAP; MULTIPLE ALLELES; CONSTRUCTION; SEGREGATION; INHERITANCE; POLYPLOIDY; QUALITY; MARKERS; GENOME;
DOI
10.1007/s00122-016-2845-5
Chinese Library Classification number
S3 [Agronomy];
Discipline classification code
0901;
Abstract
New software to make tetraploid genotype calls from SNP array data was developed, which uses hierarchical clustering and multiple F1 populations to calibrate the relationship between signal intensity and allele dosage. SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, the relationship between signal intensity and allele dosage must be calibrated for each marker. We developed an improved computational method to automate this process, which is provided as the R package ClusterCall. In the training phase of the algorithm, hierarchical clustering within an F1 population is used to group samples with similar intensity values, and allele dosages are assigned to clusters based on expected segregation ratios. In the prediction phase, multiple F1 populations and the prediction set are clustered together, and the genotype for each cluster is the mode of the training set samples. A concordance metric, defined as the proportion of training set samples equal to the mode, can be used to eliminate unreliable markers and compare different algorithms. Across three potato families genotyped with an 8K SNP array, ClusterCall scored 5729 markers with at least 0.95 concordance (94.6% of its total), compared to 5325 with the software fitTetra (82.5% of its total). The three families were used to predict genotypes for 5218 SNPs in the SolCAP diversity panel, compared with 3521 SNPs in a previous study in which genotypes were called manually. One of the additional markers produced a significant association for vine maturity near a well-known causal locus on chromosome 5. In conclusion, when multiple F1 populations are available, ClusterCall is an efficient method for accurate, autotetraploid genotype calling that enables the use of SNP data for research and plant breeding.
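The prediction step and the concordance metric described in the abstract can be illustrated with a minimal Python sketch. It is not the ClusterCall implementation (which is an R package). It assumes the cluster labels have already been produced by some hierarchical clustering step, and the function name `call_by_cluster_mode`, the input layout, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def call_by_cluster_mode(cluster_ids, training_dosage):
    """Assign each cluster the modal dosage of its training samples.

    cluster_ids: one cluster label per sample (assumed to come from a
        prior hierarchical clustering of signal intensities).
    training_dosage: one entry per sample; an integer dosage (0-4) for
        training samples, None for prediction-set samples.
    Returns (calls, concordance), where concordance is the proportion of
    training samples whose dosage equals the mode of their cluster.
    """
    if len(cluster_ids) != len(training_dosage):
        raise ValueError("inputs must have one entry per sample")

    # Collect the known dosages of the training samples in each cluster.
    by_cluster = defaultdict(list)
    for cluster, dosage in zip(cluster_ids, training_dosage):
        if dosage is not None:
            by_cluster[cluster].append(dosage)

    # Mode per cluster. Ties go to the dosage seen first; this is an
    # assumption of the sketch, not a documented ClusterCall rule.
    cluster_mode = {
        cluster: Counter(dosages).most_common(1)[0][0]
        for cluster, dosages in by_cluster.items()
    }

    # Every sample inherits the call of its cluster; clusters without
    # any training sample stay uncalled (None).
    calls = [cluster_mode.get(cluster) for cluster in cluster_ids]

    n_train = sum(d is not None for d in training_dosage)
    n_match = sum(
        d is not None and cluster_mode[c] == d
        for c, d in zip(cluster_ids, training_dosage)
    )
    concordance = n_match / n_train if n_train else float("nan")
    return calls, concordance


# Toy marker: five training samples and two prediction samples.
calls, concordance = call_by_cluster_mode(
    ["a", "a", "a", "b", "b", "b", "c"],
    [0, 0, 1, 2, 2, None, None],
)
print(calls, concordance)
```

In this toy input, one of the five training samples disagrees with the mode of its cluster, so the marker would fall below the 0.95 concordance threshold used in the abstract and could be discarded as unreliable.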
Pages: 717-726
Page count: 10