Crop Classification and Yield Prediction Using Robust Machine Learning Models for Agricultural Sustainability

Cited by: 0
Authors
Badshah, Abid [1 ]
Alkazemi, Basem Yousef [2 ]
Din, Fakhrud [1 ]
Zamli, Kamal Z. [3 ,4 ]
Haris, Muhammad [4 ]
Affiliations
[1] Univ Malakand, Dept Comp Sci & IT, Fac Informat Technol IT, Chakdara 18800, Khyber Pakhtunk, Pakistan
[2] Umm Al Qura Univ, Coll Comp, Dept Software Engn, Mecca 24382, Saudi Arabia
[3] Univ Malaysia Pahang Al Sultan Abdullah UMPSA, Fac Comp, Kuantan 26600, Pahang, Malaysia
[4] Univ Airlangga, Fac Sci & Technol, C Campus JI Dr H Soekamo, Surabaya 60115, Indonesia
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Crops; Machine learning; Biological system modeling; Predictive models; Data models; Soil; Mathematical models; Production; Agriculture; Accuracy; Agricultural planning; crop recommendation; crop yield forecasting; explainable AI; K-fold cross-validation; machine learning;
DOI
10.1109/ACCESS.2024.3486653
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Agriculture is pivotal to a country's economy because it is a major source of food, employment, and raw materials. However, challenges such as diseases, soil degradation, and water scarcity persist. Technology adoption can address these issues, improving production and quality. Machine learning, a subset of Artificial Intelligence (AI), enables prediction, classification, and automation in agriculture. It can optimize irrigation, fertilization, and crop selection, aiding decision-making for food security and crop management. This study proposes two robust machine learning architectures, one for classification and one for regression, each based on a distinct dataset. First, we examine a crop recommendation dataset obtained from Kaggle, whose input attributes include soil pH, temperature, humidity, and nutrient levels. Using the classification techniques Extra Tree Classifier (ETC), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbour (KNN), Gaussian Naive Bayes (GNB), and Support Vector Machine (SVM), we recommend among twenty-two different crops based on these inputs. With K-fold cross-validation, Explainable AI (XAI), and feature engineering, we identify the best-performing model: Random Forest, which achieves an accuracy of 99.7% and is further evaluated with precision, recall, F1 score, and a confusion matrix. Second, we investigate wheat yield prediction data obtained from the World Bank and the Food and Agriculture Organization (FAO), covering the years 1992-2013 for Pakistan. Using Multivariate Imputation by Chained Equations (MICE) to address data limitations, we estimate wheat production for 2014-2024 and forecast the 2025 yield using machine learning regression models. Again, with hyperparameter tuning and K-fold cross-validation, Support Vector Regressor (SVR) emerges as the top-performing model, achieving 99.9% as measured by the R² score.
Explainable AI (XAI) approaches make machine learning decisions comprehensible, which increases transparency and confidence in agricultural decision-making. Two widely used XAI approaches, Feature Importance and Local Interpretable Model-Agnostic Explanations (LIME), are used to interpret and explain the outcomes of the proposed models. The study can help increase agricultural productivity, minimize risks, enhance food security, and promote more environmentally friendly farming approaches.
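The K-fold cross-validation step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline. The data are synthetic, the labels `crop_a` and `crop_b` and the three features (pH, temperature, humidity) are hypothetical stand-ins for the Kaggle attributes, and a small K-Nearest Neighbour classifier is used only because it is short to write. All function names are illustrative.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Predict the majority label among the n_neighbors closest training rows."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, k=5, n_neighbors=3, seed=0):
    """Hold out each fold in turn and return one accuracy score per fold."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return scores


def make_synthetic_data(n_per_class=30, seed=0):
    """Toy (pH, temperature, humidity) rows for two made-up crop labels."""
    rng = random.Random(seed)
    centers = {"crop_a": (5.5, 20.0, 60.0), "crop_b": (7.5, 30.0, 80.0)}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 0.5) for c in center])
            y.append(label)
    return X, y


X, y = make_synthetic_data()
fold_scores = cross_validate(X, y, k=5)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", mean_accuracy)
```

Averaging the per-fold scores gives a less split-dependent estimate than a single train/test split, which is presumably why the abstract relies on it for model selection. In practice a library such as scikit-learn would replace the hand-written splitter and classifier.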
Pages: 162799-162813
Number of pages: 15