Accurate prediction of sugarcane yield using a random forest algorithm

Cited by: 230
Authors
Everingham, Yvette [1,2]
Sexton, Justin [1,2]
Skocaj, Danielle [2,3]
Inman-Bamber, Geoff [2,4]
Affiliations
[1] James Cook Univ, Ctr Trop Environm & Sustainabil Sci, Townsville, Qld 4811, Australia
[2] James Cook Univ, Coll Sci Technol & Engn, James Cook Dr, Townsville, Qld 4811, Australia
[3] Sugar Res Australia, Tully, Qld 4068, Australia
[4] Crop Sci Consulting, Townsville, Qld 4811, Australia
Keywords
APSIM; Agriculture; Nitrogen; Fertilizer; Value chain; Random forest; Selection; Cities
DOI
10.1007/s13593-016-0364-z
Chinese Library Classification number
S3 [Agricultural science (agronomy)]
Discipline classification code
0901
Abstract
Foreknowledge about sugarcane crop size can help industry members make more informed decisions. There are many different combinations of climate variables, seasonal climate prediction indices, and crop model outputs that could prove useful in explaining sugarcane crop size. A data mining method like random forests can cope with generating a prediction model when the search space of predictor variables is large. However, little research has investigated the accuracy of random forests in explaining annual variation in sugarcane productivity, or the suitability of predictor variables generated from crop models coupled with observed climate and seasonal climate prediction indices. Simulated biomass from the APSIM (Agricultural Production Systems sIMulator) sugarcane crop model, seasonal climate prediction indices, and observed rainfall, maximum and minimum temperature, and radiation were supplied as inputs to a random forest classifier and a random forest regression model to explain annual variation in regional sugarcane yields at Tully, in northeastern Australia. Prediction models were generated on 1 September in the year before harvest, and then on 1 January and 1 March in the year of harvest; the harvest typically runs from June to November. Our results indicated that in 86.36% of years, it was possible to determine as early as September in the year before harvest whether production would be above the median. This accuracy improved to 95.45% by January in the year of harvest. The R-squared of the random forest regression model gradually improved from 66.76% to 79.21% between September in the year before harvest and March in the year of harvest. All three sets of variables, namely (i) simulated biomass indices, (ii) observed climate, and (iii) seasonal climate prediction indices, typically featured in the models at various stages.
Better crop predictions allow farmers to improve their nitrogen management to meet the demands of the new crop, mill managers to better plan the mill's labor requirements and maintenance scheduling activities, and marketers to manage the forward sale and storage of the crop more confidently. Hence, accurate yield forecasts can improve industry sustainability by delivering better environmental and economic outcomes.
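The two headline measures in the abstract, the share of years correctly classed as above or below the median and the R-squared of the regression model, can be illustrated with a short sketch. This is not the authors' code: the function names and the yield figures are hypothetical, and the R-squared shown is the usual coefficient of determination, which may differ from the exact variant the paper reports.

```python
from statistics import mean, median


def above_median_labels(yields):
    """Return True for each year whose yield exceeds the series median."""
    cutoff = median(yields)
    return [y > cutoff for y in yields]


def accuracy_percent(actual, predicted):
    """Percentage of years in which the predicted class matches the actual class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


def r_squared_percent(observed, predicted):
    """Coefficient of determination, expressed as a percentage."""
    mean_obs = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 100.0 * (1.0 - ss_res / ss_tot)


# Hypothetical yields in t/ha, invented purely for illustration.
observed = [78.0, 92.0, 85.0, 101.0, 70.0, 88.0]
predicted = [80.0, 90.0, 83.0, 97.0, 75.0, 91.0]

# Classify both series against the median of the observed yields.
cutoff = median(observed)
actual_class = above_median_labels(observed)
predicted_class = [p > cutoff for p in predicted]

print(f"classification accuracy: {accuracy_percent(actual_class, predicted_class):.2f}%")
print(f"R-squared: {r_squared_percent(observed, predicted):.2f}%")
```

As a point of arithmetic, 19 correct calls in 22 years would give 86.36%, and 21 in 22 would give 95.45%; the record itself does not state how many years were evaluated.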
Pages: 9