A Comparative Analysis of XGBoost and Neural Network Models for Predicting Some Tomato Fruit Quality Traits from Environmental and Meteorological Data

Cited by: 16
Authors
M'hamdi, Oussama [1,2]
Takacs, Sandor [1 ]
Palotas, Gabor [3 ]
Ilahy, Riadh [4 ]
Helyes, Lajos [1 ]
Pek, Zoltan [1 ]
Affiliations
[1] Hungarian Univ Agr & Life Sci, Inst Hort Sci, Pater K Str 1, H-2100 Godollo, Hungary
[2] Hungarian Univ Agr & Life Sci, Doctoral Sch Plant Sci, Pater K Str 1, H-2100 Godollo, Hungary
[3] Univer Prod Zrt, Szolnok Ut 35, H-6000 Kecskemet, Hungary
[4] Univ Carthage, Natl Agr Res Inst Tunisia INRAT, Lab Hort, Ariana 1004, Tunisia
Source
PLANTS-BASEL | 2024 / Vol. 13 / Issue 05
Keywords
tomato quality; extreme gradient boosting; artificial neural network; prediction; shapley additive explanations; CROP YIELD; LYCOPENE CONTENT; CLIMATE-CHANGE; SUGAR CONTENT; GENOTYPE; COLOR; CLASSIFICATION; AGRICULTURE; TEMPERATURE; METABOLISM
DOI
10.3390/plants13050746
CLC number
Q94 [Botany]
Subject classification code
071001
Abstract
The tomato is a globally important raw material for processing and is pivotal in dietary and agronomic research because of its nutritional, economic, and health significance. This study explored the potential of machine learning (ML) for predicting tomato quality, using data from 48 cultivars and 28 locations in Hungary over 5 seasons. It focused on °Brix, lycopene content, and colour (a/b ratio), using extreme gradient boosting (XGBoost) and artificial neural network (ANN) models. XGBoost consistently outperformed ANN, achieving high accuracy in predicting °Brix (R² = 0.98, RMSE = 0.07), lycopene content (R² = 0.87, RMSE = 0.61), and colour (a/b ratio; R² = 0.93, RMSE = 0.03). ANN lagged behind, particularly in colour prediction, where it yielded a negative R² of -0.35. Shapley additive explanations (SHAP) summary plots indicated that both models are effective in predicting °Brix and lycopene content in tomatoes, while highlighting different aspects of the data. The SHAP analysis also underscored the significant influence of cultivar choice and of environmental factors such as climate and soil. These findings emphasize the importance of selecting and fine-tuning the appropriate ML model for enhancing precision agriculture, and underline XGBoost's superiority in handling complex agronomic data for quality assessment.
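The abstract compares models by R² and RMSE and reports a negative R² for the ANN colour model. The sketch below is not the authors' code; it only illustrates, with invented a/b values and hypothetical helper names (`rmse`, `r2`), how the two metrics are usually computed and why R² falls below zero when predictions are worse than simply predicting the mean of the observations.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the units of the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical a/b colour ratios, invented for illustration only
# (these are NOT data from the study).
observed = [2.0, 2.1, 2.2, 2.3, 2.4]
close_fit = [2.02, 2.08, 2.21, 2.31, 2.38]  # small residuals
poor_fit = [2.4, 2.0, 2.5, 2.0, 2.1]        # worse than predicting the mean

for name, pred in [("close fit", close_fit), ("poor fit", poor_fit)]:
    print(f"{name}: R2 = {r2(observed, pred):.3f}, RMSE = {rmse(observed, pred):.3f}")
```

A negative R² therefore does not signal an arithmetic error; it means the residual sum of squares exceeds the total sum of squares around the observed mean.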
Pages: 22