Ensemble learning prediction of soybean yields in China based on meteorological data

Cited by: 27
Authors
Li, Qian-chuan [1]
Xu, Shi-wei [1, 2, 5]
Zhuang, Jia-yu [1, 5]
Liu, Jia-Jia [2]
Zhou, Yi [3]
Zhang, Ze-xi [4]
Affiliations
[1] Chinese Acad Agr Sci, Agr Informat Inst, Beijing 100081, Peoples R China
[2] Beijing Engn Res Ctr Agr Monitoring & Early Warnin, Beijing 100081, Peoples R China
[3] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[4] Columbia Univ, Dept Math, New York, NY 10027 USA
[5] Minist Agr & Rural Affairs, Key Lab Agr Monitoring & Early Warning Technol, Beijing 100081, Peoples R China
Keywords
meteorological factors; ensemble learning; crop yield prediction; machine learning; county-level; CLIMATE DATA; WHEAT YIELD; CROP YIELD; TRENDS; TEMPERATURE; PATTERNS; RAINFALL;
DOI
10.1016/j.jia.2023.02.011
Chinese Library Classification (CLC) number
S [Agricultural Sciences];
Discipline classification code
09;
Abstract
The accurate prediction of soybean yield is of great significance for agricultural production, monitoring and early warning. Although previous studies have used machine learning algorithms to predict soybean yield based on meteorological data, it is not clear how different models can be used to effectively separate soybean meteorological yield from soybean yield in various regions. In addition, comprehensively integrating the advantages of various machine learning algorithms through ensemble learning to improve prediction accuracy has not been studied in depth. This study analyzed daily meteorological data and soybean yield data from 173 county-level administrative regions and meteorological stations in two principal soybean planting areas in China (Northeast China and the Huang-Huai region), covering 34 years. Three effective machine learning algorithms (K-nearest neighbor, random forest, and support vector regression) were adopted as the base models to establish a high-precision and highly reliable soybean meteorological yield prediction model based on the stacking ensemble learning framework. The model's generalizability was further improved through 5-fold cross-validation, and the model was optimized by principal component analysis and hyperparameter optimization. The accuracy of the model was evaluated using five-year sliding prediction and four regression indicators across the 173 counties, which showed that the stacking model has higher accuracy and stronger robustness. The five-year sliding estimations of soybean yield based on the stacking model in the 173 counties showed that the predictions reflect the spatiotemporal distribution of soybean yield in detail, with a mean absolute percentage error (MAPE) of less than 5%. The stacking prediction model of soybean meteorological yield provides a new approach for accurately predicting soybean yield.
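The stacking setup described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the authors' implementation: the data are synthetic stand-ins, the hyperparameter values are arbitrary, the linear meta-learner is an assumption (the abstract does not name one), and hyperparameter search is omitted for brevity. It only shows how K-nearest neighbor, random forest, and support vector regression base models can be combined with 5-fold cross-validation and principal component analysis, and how MAPE is computed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for meteorological features and yields (NOT the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))
y = 2.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + 0.05 * rng.normal(size=300)

# Simple hold-out split: first 240 rows for training, last 60 for testing.
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

# Three base models named in the abstract; hyperparameters here are arbitrary.
stack = StackingRegressor(
    estimators=[
        ("knn", KNeighborsRegressor(n_neighbors=5)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("svr", SVR(C=1.0)),
    ],
    final_estimator=LinearRegression(),  # assumed meta-learner
    cv=5,  # out-of-fold base predictions from 5-fold cross-validation
)

# Standardize, keep the principal components explaining 95% of variance, then stack.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
    stack,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Mean absolute percentage error, expressed as a fraction of the true value.
mape = float(np.mean(np.abs((y_test - y_pred) / y_test)))
print(f"MAPE on held-out rows: {mape:.3%}")
```

In this sketch the `cv=5` argument makes the meta-learner train on out-of-fold predictions of the base models, which is what keeps the stacked model from simply memorizing base-model fits on the training rows.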
Pages: 1909-1927
Number of pages: 19