Enhancing crop yield prediction in Senegal using advanced machine learning techniques and synthetic data

Cited by: 1
Authors
Razavi, Mohammad Amin [1]
Nejadhashemi, A. Pouyan [2]
Majidi, Babak [3]
Razavi, Hoda S. [2]
Kpodo, Josue [2, 4]
Eeswaran, Rasu [2, 5, 6]
Ciampitti, Ignacio [6]
Prasad, P. V. Vara [7]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Tehran, Iran
[2] Michigan State Univ, Dept Biosyst & Agr Engn, E Lansing, MI 48824 USA
[3] Khatam Univ, Dept Comp Engn, Tehran, Iran
[4] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI USA
[5] Univ Jaffna, Fac Agr, Dept Agron, Kilinochchi, Sri Lanka
[6] Kansas State Univ, Dept Agron, Manhattan, KS USA
[7] Kansas State Univ, Feed Future Sustainable Intensificat Innovat Lab, Manhattan, KS USA
Source
ARTIFICIAL INTELLIGENCE IN AGRICULTURE | 2024 / Vol. 14
Funding
National Institute of Food and Agriculture (USA);
Keywords
Crop yield prediction; Variational auto encoder; Pattern recognition on spatiotemporal and physiographical variables; Synthetic tabular data generation; Ensemble learning; INTERPOLATION METHODS; CLIMATE-CHANGE; AGRICULTURE; MANAGEMENT; SYSTEMS
DOI
10.1016/j.aiia.2024.11.005
Chinese Library Classification
S [Agricultural Sciences];
Subject classification code
09 ;
Abstract
In this study, we employ advanced data-driven techniques to investigate the complex relationships between the yields of five major crops and various geographical and spatiotemporal features in Senegal. Using remotely sensed data, we analyze how these features influence crop yields. Our methodology incorporates clustering algorithms and correlation matrix analysis to identify significant patterns and dependencies, offering a comprehensive understanding of the factors affecting agricultural productivity in Senegal. To identify the optimal hyperparameters, we run a comprehensive grid search across four distinct machine learning regressors: Random Forest, Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Light Gradient-Boosting Machine (LightGBM). Each regressor offers unique functionalities, broadening our exploration of potential model configurations. The top-performing models are selected by evaluating multiple performance metrics, ensuring robust and accurate predictive capabilities. The results show that XGBoost and CatBoost perform better than Random Forest and LightGBM. To address the challenges posed by limited agricultural datasets, we introduce synthetic crop data generated using a Variational Auto Encoder. By achieving high similarity scores with real-world data, our synthetic samples enhance model robustness, mitigate overfitting, and provide a viable solution for small dataset issues in agriculture. Our approach distinguishes itself by creating a single flexible model applicable to multiple crops. By integrating five crop datasets and generating high-quality synthetic data, we improve model performance, reduce overfitting, and enhance realism. Our findings provide crucial insights into productivity drivers in key cropping systems, enabling robust recommendations and strengthening the decision-making capabilities of policymakers and farmers in data-scarce regions.
(c) 2024 The Authors. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
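The grid search over several regressors described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn is available, uses `GradientBoostingRegressor` as a stand-in for the XGBoost, CatBoost, and LightGBM regressors named above, and runs on randomly generated placeholder data rather than the Senegal dataset. The hyperparameter grids, the R^2 scoring choice, and the names `candidates`, `results`, and `best_name` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Hypothetical placeholder data: 200 samples, 6 features (NOT the Senegal data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Candidate regressors with small illustrative grids (values are assumptions).
candidates = {
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
    "gradient_boosting": (  # stand-in for XGBoost / CatBoost / LightGBM
        GradientBoostingRegressor(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    ),
}

# Shared 5-fold split so every model is scored on the same folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="r2")
    search.fit(X, y)
    # Keep the best hyperparameters and their mean cross-validated R^2.
    results[name] = (search.best_params_, search.best_score_)

# Select the regressor with the highest cross-validated score.
best_name = max(results, key=lambda k: results[k][1])
for name, (params, score) in results.items():
    print(f"{name}: best R^2 = {score:.3f} with {params}")
print("selected:", best_name)
```

The abstract mentions selecting models on multiple performance metrics; the sketch ranks on a single metric for brevity, and `GridSearchCV` accepts a list of scorers together with a `refit` argument if several metrics are needed.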
Pages: 99-114
Page count: 16