Multi-dimensional machine learning approaches for fruit shape phenotyping in strawberry

Cited by: 32
Authors
Feldmann, Mitchell J. [1]
Hardigan, Michael A. [1]
Famula, Randi A. [1]
Lopez, Cindy M. [1]
Tabb, Amy [2]
Cole, Glenn S. [1]
Knapp, Steven J. [1]
Affiliations
[1] Univ Calif Davis, Dept Plant Sci, 1 Shields Ave, Davis, CA 95616 USA
[2] USDA ARS AFRS, 2217 Wiltshire Rd, Kearneysville, WV 25430 USA
Funding
National Institute of Food and Agriculture (United States)
Keywords
Fragaria × ananassa; fruit shape; morphometrics; latent space phenotypes; machine learning; principal progression of k clusters; BAYESIAN CLASSIFICATION; QUANTITATIVE GENETICS; ENABLED PREDICTION; GENOMIC SELECTION; LEAF SHAPE; TOMATO; TRAITS; IMAGE; RESISTANCE; EVOLUTION
DOI
10.1093/gigascience/giaa030
Chinese Library Classification (CLC)
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Background: Shape is a critical element of the visual appeal of strawberry fruit and is influenced by both genetic and non-genetic determinants. Current fruit phenotyping approaches for external characteristics in strawberry often rely on the human eye to make categorical assessments. However, fruit shape is an inherently multi-dimensional, continuously variable trait and not adequately described by a single categorical or quantitative feature. Morphometric approaches enable the study of complex, multi-dimensional forms but are often abstract and difficult to interpret. In this study, we developed a mathematical approach for transforming fruit shape classifications from digital images onto an ordinal scale called the Principal Progression of k Clusters (PPKC). We use these human-recognizable shape categories to select quantitative features extracted from multiple morphometric analyses that are best fit for genetic dissection and analysis.
Results: We transformed images of strawberry fruit into human-recognizable categories using unsupervised machine learning, discovered 4 principal shape categories, and inferred progression using PPKC. We extracted 68 quantitative features from digital images of strawberries using a suite of morphometric analyses and multivariate statistical approaches. These analyses defined informative feature sets that effectively captured quantitative differences between shape classes. Classification accuracy ranged from 68% to 99% for the newly created phenotypic variables for describing a shape.
Conclusions: Our results demonstrated that strawberry fruit shapes could be robustly quantified, accurately classified, and empirically ordered using image analyses, machine learning, and PPKC. We generated a dictionary of quantitative traits for studying and predicting shape classes and identifying genetic factors underlying phenotypic variability for fruit shape in strawberry. The methods and approaches that we applied in strawberry should apply to other fruits, vegetables, and specialty crops.
Pages: 17