Sample size determination for training set optimization in genomic prediction

Cited by: 9
Authors
Wu, Po-Ya [1 ,2 ]
Ou, Jen-Hsiang [1 ,3 ]
Liao, Chen-Tuo [1 ]
Affiliations
[1] Natl Taiwan Univ, Dept Agron, Taipei, Taiwan
[2] Heinrich Heine Univ, Inst Quant Genet & Genom Plants, Dusseldorf, Germany
[3] Uppsala Univ, Dept Med Biochem & Microbiol, Uppsala, Sweden
Keywords
CALIBRATION SET; LINEAR-MODELS; SELECTION; ACCURACY; INDIVIDUALS; REGRESSION; PRECISION;
DOI
10.1007/s00122-023-04254-9
Chinese Library Classification number
S3 [Agronomy];
Discipline classification code
0901;
Abstract
Genomic prediction (GP) is a statistical method used to select quantitative traits in animal or plant breeding. For this purpose, a statistical prediction model is first built that uses phenotypic and genotypic data in a training set. The trained model is then used to predict genomic estimated breeding values (GEBVs) for individuals within a breeding population. Setting the sample size of the training set usually takes into account time and space constraints that are inevitable in an agricultural experiment. However, the determination of the sample size remains an unresolved issue for a GP study. By applying the logistic growth curve to identify prediction accuracy for the GEBVs and the training set size, a practical approach was developed to determine a cost-effective optimal training set for a given genome dataset with known genotypic data. Three real genome datasets were used to illustrate the proposed approach. An R function is provided to facilitate widespread application of this approach to sample size determination, which can help breeders to identify a set of genotypes with an economical sample size for selective phenotyping.
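The idea of relating prediction accuracy to training set size through a logistic growth curve can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the three-parameter form of the curve, the parameter values, the helper names `logistic_accuracy` and `smallest_size_reaching`, and the stopping rule (the smallest size whose predicted accuracy reaches a chosen fraction of the asymptote). The abstract does not state the paper's actual criterion, and this is not the R function it mentions.

```python
import math


def logistic_accuracy(n, asymptote, rate, midpoint):
    """Three-parameter logistic growth curve: predicted accuracy at training set size n."""
    return asymptote / (1.0 + math.exp(-rate * (n - midpoint)))


def smallest_size_reaching(fraction, asymptote, rate, midpoint, n_max):
    """Smallest n in 1..n_max whose predicted accuracy is at least
    fraction * asymptote; None if no size up to n_max reaches it."""
    target = fraction * asymptote
    for n in range(1, n_max + 1):
        if logistic_accuracy(n, asymptote, rate, midpoint) >= target:
            return n
    return None


# Hypothetical curve parameters, chosen only for illustration (not from the paper).
params = dict(asymptote=0.6, rate=0.02, midpoint=100.0)

# Assumed rule: stop at the first size reaching 95% of the asymptotic accuracy.
n_star = smallest_size_reaching(0.95, n_max=1000, **params)
print(n_star)
```

In practice the curve parameters would be estimated from accuracies observed at several candidate training set sizes, and the fraction would reflect how much accuracy a breeder is willing to trade for reduced phenotyping cost.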
Pages: 13
Related papers
50 in total
  • [1] Genomic prediction and training set optimization in a structured Mediterranean oat population
    Rio, Simon
    Gallego-Sanchez, Luis
    Montilla-Bascon, Gracia
    Canales, Francisco J.
    Sanchez, Julio Isidro y
    Prats, Elena
    THEORETICAL AND APPLIED GENETICS, 2021, 134 (11) : 3595 - 3609
  • [2] Training set optimization of genomic prediction by means of EthAcc
    Mangin, Brigitte
    Rincent, Renaud
    Rabier, Charles-Elie
    Moreau, Laurence
    Goudemand-Dugue, Ellen
    PLOS ONE, 2019, 14 (02):
  • [3] Training set determination for genomic selection
    Ou, Jen-Hsiang
    Liao, Chen-Tuo
    THEORETICAL AND APPLIED GENETICS, 2019, 132 (10) : 2781 - 2792
  • [4] Sparse kernel models provide optimization of training set design for genomic prediction in multiyear wheat breeding data
    Lopez-Cruz, Marco
    Dreisigacker, Susanne
    Crespo-Herrera, Leonardo
    Bentley, Alison R.
    Singh, Ravi
    Poland, Jesse
    Shrestha, Sandesh
    Huerta-Espino, Julio
    Govindan, Velu
    Juliana, Philomin
    Mondal, Suchismita
    Perez-Rodriguez, Paulino
    Crossa, Jose
    PLANT GENOME, 2022, 15 (04)
  • [5] Training Set Construction for Genomic Prediction in Auto-Tetraploids: An Example in Potato
    Wilson, Stefan
    Malosetti, Marcos
    Maliepaard, Chris
    Mulder, Han A.
    Visser, Richard G. F.
    van Eeuwijk, Fred
    FRONTIERS IN PLANT SCIENCE, 2021, 12
  • [6] Genomic prediction in hybrid breeding: I. Optimizing the training set design
    Melchinger, Albrecht E. E.
    Fernando, Rohan
    Stricker, Christian
    Schoen, Chris-Carolin
    Auinger, Hans-Juergen
    THEORETICAL AND APPLIED GENETICS, 2023, 136 (08)
  • [7] A Function Accounting for Training Set Size and Marker Density to Model the Average Accuracy of Genomic Prediction
    Erbe, Malena
    Gredler, Birgit
    Seefried, Franz Reinhold
    Bapst, Beat
    Simianer, Henner
    PLOS ONE, 2013, 8 (12):
  • [8] Genomic prediction using training population design in interspecific soybean populations
    Beche, Eduardo
    Gillman, Jason D.
    Song, Qijian
    Nelson, Randall
    Beissinger, Tim
    Decker, Jared
    Shannon, Grover
    Scaboo, Andrew M.
    MOLECULAR BREEDING, 2021, 41 (02)
  • [9] Training set design in genomic prediction with multiple biparental families
    Zhu, Xintian
    Leiser, Willmar L.
    Hahn, Volker
    Wuerschum, Tobias
    PLANT GENOME, 2021, 14 (03)
  • [10] Training set optimization under population structure in genomic selection
    Isidro, Julio
    Jannink, Jean-Luc
    Akdemir, Deniz
    Poland, Jesse
    Heslot, Nicolas
    Sorrells, Mark E.
    THEORETICAL AND APPLIED GENETICS, 2015, 128 (01) : 145 - 158