Prediction of Winter Wheat Yield Based on Multi-Source Data and Machine Learning in China

Cited by: 186
Authors
Han, Jichong [1]
Zhang, Zhao [1]
Cao, Juan [1]
Luo, Yuchuan [1]
Zhang, Liangliang [1]
Li, Ziyue [1]
Zhang, Jing [1]
Affiliations
[1] Beijing Normal Univ, Fac Geog Sci, MoE Key Lab Environm Change & Nat Hazards, State Key Lab Earth Surface Proc & Resource Ecol, Beijing 100875, Peoples R China
Keywords
wheat yield prediction; multi-source data; machine learning; Google Earth Engine (GEE); Triticum aestivum L; LEAF-AREA INDEX; SENSED VEGETATION INDEXES; CROP YIELD; CLIMATE-CHANGE; HEAT-STRESS; MODIS-NDVI; STATISTICAL-MODELS; LOESS PLATEAU; PLANTING DATE; GRAIN YIELDS;
DOI
10.3390/rs12020236
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Wheat is one of the main crops in China, and crop yield prediction is important for regional trade and national food security. There is growing interest in how to integrate multi-source data and machine learning techniques to build a simple, timely, and accurate crop yield prediction model at the administrative-unit level. Many previous studies mainly focused on the whole crop growth period and relied on expensive manual surveys, remote sensing, or climate data. However, the effect of selecting different time windows on yield prediction remained unknown. We therefore divided the whole growth period into four time windows and assessed their predictive ability, taking the major winter wheat production regions of China as an example. We first developed a modeling framework on the Google Earth Engine (GEE) platform that integrates climate, remote sensing, and soil data to predict winter wheat yield. The results show that the models can accurately predict yield 1 to 2 months before the harvesting dates at the county level in China, with R² > 0.75 and a yield error of less than 10%. Support vector machine (SVM), Gaussian process regression (GPR), and random forest (RF) were the three best-performing methods among the eight typical machine learning models tested in this study. We also found that different agricultural zones and temporal training settings affect prediction accuracy. The three models perform better as more winter wheat growing season information becomes available. Our findings highlight a potentially powerful tool for predicting yield using multi-source data and machine learning in other regions and for other crops.
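The two accuracy figures quoted in the abstract (R² above 0.75 and a yield error below 10%) can be illustrated with a minimal sketch. The abstract does not define "yield error", so the code below assumes it means the mean absolute percentage error; the function names and the yield values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean absolute percentage error in percent (assumed error definition)."""
    return 100.0 * mean(abs(o - p) / o for o, p in zip(observed, predicted))


# Hypothetical county-level yields in t/ha (illustrative values only).
observed = [5.0, 6.0, 4.0, 7.0]
predicted = [5.5, 5.5, 4.5, 6.5]

r2 = r_squared(observed, predicted)
err = mean_relative_error(observed, predicted)

# Check the two thresholds mentioned in the abstract.
meets_thresholds = r2 > 0.75 and err < 10.0
print(f"R^2 = {r2:.3f}, error = {err:.2f}%, meets thresholds: {meets_thresholds}")
```

In a setting like the one described, the same two functions could be evaluated once per time window to compare how accuracy changes as more of the growing season is observed.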
Pages: 22