Evaluating machine learning models and identifying key factors influencing spatial maize yield predictions in data intensive farm management

Cited by: 3
Authors
Maseko, S. [1]
van der Laan, M. [1,2]
Tesfamariam, E. H. [1]
Delport, M. [3]
Otterman, H. [3]
Affiliations
[1] Univ Pretoria, Dept Plant & Soil Sci, Private Bag X20, ZA-0028 Pretoria, South Africa
[2] Agr Res Council Nat Resources & Engn, Private Bag X79, ZA-0001 Pretoria, South Africa
[3] Bur Food & Agr Policy BFAP, Agri Hub Off Pk, Pretoria, South Africa
Funding
National Research Foundation, Singapore;
Keywords
Precision agriculture; Random Forest; Yield limiting factors; Yield variability; CROP YIELD; SOIL; CORN; TOPOGRAPHY; BIOMASS;
DOI
10.1016/j.eja.2024.127193
Chinese Library Classification
S3 [Agronomy];
Discipline classification code
0901;
Abstract
Understanding the relationships between crop yields, soil properties, weather patterns and input applications is important for optimizing agricultural production. Analysing variation in data with statistical and machine learning (ML) approaches can help identify and understand the practices that optimize yield. The objectives of this study were (i) to evaluate the predictive accuracy of selected ML models for estimating grain yields in on-farm maize (Zea mays L.) trials with different combinations of seeding and fertilizer rates in a commercial field, and (ii) to investigate the ability of ML models to assist in identifying yield-limiting factors in the same field. Multiple linear regression, multilayer perceptron, decision tree, and random forest (RF) ML models were trained and tested using crop management and soil data from a data-intensive farm management (DIFM) trial, together with remotely sensed data. The dataset consisted of multiple subplot treatment observations of crop management, soil properties and normalized difference vegetation index (NDVI), linked to final grain yield for the 2019/2020 and 2020/2021 seasons. The RF had the best combination of high correlation (R² = 0.69 and 0.80) and low error (MAPE = 5.4 and 8.4% and RMSE = 0.69 and 0.95 t ha⁻¹) when compared to the other models for both seasons. Feature importance analysis revealed that urea application was consistently the most critical variable and explained yield variation to the greatest extent. Other influential factors for explaining yields were soil phosphorus (P), plant population, and sodium in 2020, and soil P, soil pH, clay content, and plant population in 2021. This study concluded that the RF model was the best for spatial yield predictions using DIFM trial datasets. Yield-limiting factors also varied between seasons as a result of temporal variations in growing conditions.
To effectively apply insights from yield prediction models, it is crucial that the variables incorporated into these models have a significant connection to yield and that the findings can be translated into actionable management decisions. The DIFM trials combined with ML can play an important role in advancing the field of precision agriculture by providing valuable insights into the complex interactions between crops, soils, and management practices, and by identifying new opportunities for improving crop yields and environmental sustainability.
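The abstract compares models by R², MAPE and RMSE without giving the formulas. A minimal sketch of how these three metrics are commonly computed follows. It assumes R² is the coefficient of determination (the abstract calls it "correlation", so the authors may instead have used the squared correlation coefficient), and the yield values are hypothetical numbers chosen for illustration, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the yields (e.g. t/ha)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be non-zero."""
    relative_errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative_errors) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical subplot yields in t/ha, for illustration only.
observed = [8.0, 10.0, 12.0, 10.0]
predicted = [9.0, 10.0, 11.0, 10.0]

print(f"RMSE = {rmse(observed, predicted):.2f} t/ha")
print(f"MAPE = {mape(observed, predicted):.1f} %")
print(f"R2   = {r_squared(observed, predicted):.2f}")
```

RMSE is reported in yield units while MAPE is relative, which is why the abstract can quote both for seasons with different yield levels.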
Pages: 11
Related papers
44 in total
  • [21] Lacerda L.N., 2022, Identifying Key Factors Influencing Yield Spatial Pattern and Temporal Stability for Management Zone Delineation
  • [22] Expert Insights on the Impacts of, and Potential for, Agricultural Big Data
    Lassoued, Rim
    Macall, Diego M.
    Smyth, Stuart J.
    Phillips, Peter W. B.
    Hesseln, Hayley
    [J]. SUSTAINABILITY, 2021, 13 (05) : 1 - 18
  • [23] Li C., 2020, J. Agric. Sci. Technol. B, V10, P195, DOI 10.17265/2161-6264/2020.04.001
  • [24] Toward Automated Machine Learning-Based Hyperspectral Image Analysis in Crop Yield and Biomass Estimation
    Li, Kai-Yun
    Sampaio de Lima, Raul
    Burnside, Niall G.
    Vahtmaee, Ele
    Kutser, Tiit
    Sepp, Karli
    Cabral Pinheiro, Victor Henrique
    Yang, Ming-Der
    Vain, Ants
    Sepp, Kalev
    [J]. REMOTE SENSING, 2022, 14 (05)
  • [25] Timing and rates of nitrogen fertiliser application on seed yield, quality and nitrogen-use efficiency of canola
    Ma, B. L.
    Herath, A. W.
    [J]. CROP & PASTURE SCIENCE, 2016, 67 (02) : 167 - 180
  • [26] Maimon Oded Z, 2014, Data mining with decision trees: theory and applications, V81
  • [27] Naser MZ, 2020, arXiv:2006.00887
  • [28] Estimating Biomass and Nitrogen Amount of Barley and Grass Using UAV and Aircraft Based Spectral and Photogrammetric 3D Features
    Nasi, Roope
    Viljanen, Niko
    Kaivosoja, Jere
    Alhonoja, Katja
    Hakala, Teemu
    Markelin, Lauri
    Honkavaara, Eija
    [J]. REMOTE SENSING, 2018, 10 (07)
  • [29] Delineation of Soil Management Zones for Variable-Rate Fertilization: A Review
    Nawar, Said
    Corstanje, Ronald
    Halcro, Graham
    Mulla, David
    Mouazen, Abdul M.
    [J]. ADVANCES IN AGRONOMY, VOL 143, 2017, 143 : 175 - 245
  • [30] Interpretable machine learning methods to explain on-farm yield variability of high productivity wheat in Northwest India
    Nayak, Hari Sankar
    Silva, Joao Vasco
    Parihar, Chiter Mal
    Krupnik, Timothy J.
    Sena, Dipaka Ranjan
    Kakraliya, Suresh K.
    Jat, Hanuman Sahay
    Sidhu, Harminder Singh
    Sharma, Parbodh C.
    Jat, Mangi Lal
    Sapkota, Tek B.
    [J]. FIELD CROPS RESEARCH, 2022, 287