A methods guideline for deep learning for tabular data in agriculture with a case study to forecast cereal yield

Cited by: 23
Authors
Richetti, Jonathan [1]
Diakogianis, Foivos I. [2]
Bender, Asher [3]
Colaco, Andre F. [4]
Lawes, Roger A. [1]
Affiliations
[1] CSIRO, 147 Underwood Ave, Floreat, WA 6014, Australia
[2] Data61, CSIRO, 26 Dick Perry Ave, Kensington, WA 6151, Australia
[3] Univ Sydney, Australian Ctr Field Robot, Camperdown, NSW 2006, Australia
[4] CSIRO, Waite Campus, Locked Bag 2, Glen Osmond, SA 5064, Australia
Keywords
Machine learning; Artificial neural network; Multi-layer perceptron; Random forest; XGBoost; TabNet; Wheat; Barley; NEURAL-NETWORKS; NITROGEN; PREDICTION; MODELS
DOI
10.1016/j.compag.2023.107642
Chinese Library Classification (CLC) number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Machine learning (ML) and its branch, deep learning (DL), are rapidly evolving and gaining popularity as they outperform other, more traditional methods in different areas of agriculture. However, ML and DL techniques must be correctly applied to a problem to produce an acceptable solution. This article provides guidelines for using DL techniques, with a case study using different models and methods to forecast yields in cereals; some of the concepts presented here are also applicable to ML more broadly. The objective is to provide clarity for new users around the use of DL techniques to solve agronomic problems. DL concepts are introduced, and best practices for data pre-processing steps and metrics are recommended. Cross-validation is clarified, and its importance is highlighted. It is shown that DL performance can vary with architecture and that the optimal choice is task-dependent. Practical aspects of applying DL models to agricultural datasets are emphasised, such as dataset size (26 representative samples in each field sufficed) and cross-validation (indispensable on small datasets). Lastly, a standard guideline for DL applied to tabular data is recommended.
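The abstract stresses that cross-validation is indispensable on small datasets. As a rough illustration only, the sketch below runs k-fold cross-validation on synthetic data with 26 samples. The variable names, the nitrogen-to-yield relationship, and the one-feature least-squares model are all assumptions made for this example; they are not the models or data used in the article.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=5, seed=0):
    """Return the RMSE measured on each held-out fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit only on the training folds, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(statistics.fmean(sq_err) ** 0.5)
    return scores


# Hypothetical synthetic data: 26 samples, echoing the per-field count in the abstract.
rng = random.Random(42)
nitrogen = [rng.uniform(0, 150) for _ in range(26)]
yield_t_ha = [2.0 + 0.02 * n + rng.gauss(0, 0.3) for n in nitrogen]

rmse_per_fold = cross_validate(nitrogen, yield_t_ha, k=5)
print("RMSE per fold:", rmse_per_fold)
print("Mean RMSE:", statistics.fmean(rmse_per_fold))
```

Reporting the spread of the per-fold scores alongside their mean gives a sense of how much the error estimate depends on which samples happen to be held out, which matters most when the dataset is small.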
Pages: 12