Using Machine Learning and Feature Selection for Alfalfa Yield Prediction

Cited by: 31
Authors
Whitmire, Christopher D. [1]
Vance, Jonathan M. [2]
Rasheed, Hend K.
Missaoui, Ali [3]
Rasheed, Khaled M. [1, 2]
Maier, Frederick W. [1]
Affiliations
[1] Univ Georgia, Inst Artificial Intelligence, 515 Boyd Grad Studies,200 DW Brooks Dr, Athens, GA 30602 USA
[2] Univ Georgia, Dept Comp Sci, 415 Boyd Grad Studies,200 D W Brooks Dr, Athens, GA 30602 USA
[3] Univ Georgia, Inst Plant Breeding Genet & Genom, Dept Crop & Soil Sci, 4317 Miller Plant Sci, Athens, GA 30602 USA
Keywords
alfalfa; cross validation; feature selection; machine learning; regression; yield prediction;
DOI
10.3390/ai2010006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting alfalfa biomass and crop yield for livestock feed is important to the daily lives of virtually everyone, and many features of data from this domain combined with corresponding weather data can be used to train machine learning models for yield prediction. In this work, we used yield data of different alfalfa varieties from multiple years in Kentucky and Georgia, and we compared the impact of different feature selection methods on machine learning (ML) models trained to predict alfalfa yield. Linear regression, regression trees, support vector machines, neural networks, Bayesian regression, and nearest neighbors were all developed with cross validation. The features used included weather data, historical yield data, and the sown date. The feature selection methods that were compared included a correlation-based method, the ReliefF method, and a wrapper method. We found that the best method was the correlation-based method, and the feature set it found consisted of the Julian day of the harvest, the number of days between the sown and harvest dates, cumulative solar radiation since the previous harvest, and cumulative rainfall since the previous harvest. Using these features, the k-nearest neighbor and random forest methods achieved an average R value over 0.95, and average mean absolute error less than 200 lbs./acre. Our top R² of 0.90 beats a previous work's best R² of 0.87. Our primary contribution is the demonstration that ML, with feature selection, shows promise in predicting crop yields even on simple datasets with a handful of features, and that reporting accuracies in R and R² offers an intuitive way to compare results among various crops.
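The workflow the abstract describes (select features by correlation, then evaluate a k-nearest-neighbor regressor with cross validation and report R and mean absolute error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the feature names and coefficients are hypothetical, and the selection step is a plain ranking by absolute Pearson correlation, which is a simplified stand-in and not necessarily the correlation-based method the authors used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_by_correlation(X, y, names, top_k):
    """Rank features by |Pearson r| with the target and keep the top_k names."""
    scores = {
        name: abs(pearson([row[j] for row in X], y))
        for j, name in enumerate(names)
    }
    return sorted(names, key=lambda name: scores[name], reverse=True)[:top_k]


def knn_predict(train_X, train_y, x, k):
    """Predict the mean target of the k training rows closest to x."""
    nearest = sorted(
        (math.dist(row, x), target) for row, target in zip(train_X, train_y)
    )[:k]
    return sum(target for _, target in nearest) / len(nearest)


def cross_validate_knn(X, y, k=5, folds=5):
    """Return (R, MAE) computed from out-of-fold k-NN predictions."""
    n = len(X)
    preds = [0.0] * n
    for f in range(folds):
        test_idx = set(range(f, n, folds))  # every folds-th row is held out
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            preds[i] = knn_predict(train_X, train_y, X[i], k)
    r = pearson(preds, y)
    mae = sum(abs(p - t) for p, t in zip(preds, y)) / n
    return r, mae


# Synthetic data (hypothetical): two informative features, two pure-noise ones,
# all scaled to [0, 1] so that Euclidean distance treats them equally.
rng = random.Random(0)
names = ["solar_radiation", "rainfall", "noise_a", "noise_b"]
X = [[rng.random() for _ in names] for _ in range(200)]
y = [3000 * row[0] + 1500 * row[1] + rng.gauss(0, 50) for row in X]

# For brevity, selection uses all rows; a stricter protocol would repeat the
# selection inside each training fold to avoid information leakage.
selected = select_by_correlation(X, y, names, top_k=2)
cols = [names.index(name) for name in selected]
X_selected = [[row[j] for j in cols] for row in X]

r, mae = cross_validate_knn(X_selected, y, k=5, folds=5)
print("selected features:", selected)
print(f"R = {r:.3f}, R^2 = {r ** 2:.3f}, MAE = {mae:.1f}")
```

If R² is taken as the squared correlation between predicted and observed yield, as in this sketch, an R of 0.95 corresponds to an R² of about 0.90, which is consistent with the two figures quoted in the abstract.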
Pages: 71-88
Number of pages: 18