Using machine learning algorithms to cluster and classify stone pine (Pinus pinea L.) populations based on seed and seedling characteristics

Cited by: 1
Authors
Caliskan, Servet [1]
Kartal, Elif [2]
Balekoglu, Safa [1]
Caliskan, Fatma [3]
Affiliations
[1] Istanbul Univ Cerrahpasa, Fac Forestry, Silviculture Dept, Istanbul, Turkiye
[2] Istanbul Univ, Fac Econ, Dept Management Informat Syst, TR-34116 Istanbul, Turkiye
[3] Istanbul Univ, Fac Sci, Dept Math, Istanbul, Turkiye
Keywords
Afforestation; Breeding; Machine learning; Phenotyping; Supervised learning; Unsupervised learning; CONE; GERMINATION;
DOI
10.1007/s10342-024-01716-7
Chinese Library Classification (CLC) number
S7 [Forestry]
Discipline classification codes
0829; 0907
Abstract
The phenotype of a woody plant represents its unique morphological properties. Population discrimination and individual classification are crucial for breeding populations and conserving genetic diversity. Machine Learning (ML) algorithms are gaining traction as powerful tools for predicting phenotypes. The present study is focused on classifying and clustering the seeds and seedlings in terms of morphological characteristics using ML algorithms. In addition, the k-means algorithm is used to determine the ideal number of clusters. The results obtained from the k-means algorithm were then compared with reality. The best classification performance achieved by the Random Forest algorithm was an accuracy of 0.648 and an F1-Score of 0.658 for the seed traits. Also, the best classification performance for stone pine seedlings was observed for the k-Nearest Neighbors algorithm (k = 18), for which the accuracy and F1-Score were 0.571 and 0.582, respectively. The best clustering performance was achieved with k = 2 for the seed (average Silhouette index = 0.48) and seedling (average Silhouette Index = 0.51) traits. According to the principal component analysis, two dimensions accounted for 97% and 63% of the traits of seeds and seedlings, respectively. The most important features between the seed and seedling traits were cone weight and bud set, respectively. This study will provide a foundation and motivation for future efforts in forest management practices, particularly regarding reforestation, yield optimization, and breeding programs.
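The abstract describes selecting the number of clusters by running k-means and comparing the average Silhouette index across candidate values of k. A minimal sketch of that selection step follows. It is not the authors' implementation: the data are synthetic (two made-up, standardized "traits" drawn around two centers), the k-means is a plain Lloyd's iteration written for illustration, and the names `kmeans`, `average_silhouette`, `points`, `scores`, and `best_k` are assumptions introduced here.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members)
                                   for x in zip(*members))
    return labels


def average_silhouette(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute s(i) = 0
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of another cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in data (ASSUMPTION): two well-separated groups,
# each point being two hypothetical standardized trait values.
rng = random.Random(42)
points = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
points += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]

# Score each candidate k and keep the one with the highest average.
scores = {k: average_silhouette(points, kmeans(points, k))
          for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The average Silhouette index ranges from -1 to 1, with higher values indicating tighter, better-separated clusters. The values reported in the abstract (0.48 for seed traits and 0.51 for seedling traits at k = 2) would be compared against the scores for other k in the same way.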
Pages: 1575-1591
Page count: 17