Machine learning methods in near infrared spectroscopy for predicting sensory traits in sweetpotatoes

Cited by: 6
Authors
Nantongo, Judith Ssali [1]
Serunkuma, Edwin [1]
Burgos, Gabriela [2]
Nakitto, Mariam [1]
Davrieux, Fabrice [3]
Ssali, Reuben [1]
Affiliations
[1] Int Potato Ctr, Ntinda 2 Rd,Plot 47,POB 22274, Kampala, Uganda
[2] Int Potato Ctr, Lima, Peru
[3] CIRAD, UMR Qualisud, F-34398 Montpellier, France
Keywords
High throughput; Breeding efficiency; Consumer preferences; Food security; NIR SPECTROSCOPY; ELASTIC NET; REGRESSION; PLS; SELECTION; LIKING
DOI
10.1016/j.saa.2024.124406
CLC number
O433 [Spectroscopy]
Discipline classification code
0703; 070302
Abstract
It has been established that near infrared (NIR) spectroscopy has the potential to estimate sensory traits, given the direct spectral responses that these properties have in the NIR region. In sweetpotato, sensory and texture traits are key to improving acceptability of the crop for food security and nutrition. Studies have statistically modelled the levels of sensory characteristics from NIR spectra using partial least squares (PLS) regression methods. Many advanced techniques are available that could improve prediction accuracy by better modelling fresh (wet and unprocessed) samples or nonlinear dependence relationships. The performance of quantitative prediction models for sensory traits developed using different machine learning methods was compared. Overall, the results show that the linear methods, namely linear support vector machine (L-SVM), principal component regression (PCR) and PLS, exhibited higher mean R² values than the other statistical methods. Across all 27 sensory traits, calibration models using L-SVM and PCR had a slightly higher overall R² (x̄ = 0.33) than PLS (x̄ = 0.32) and radial-based SVM (NL-SVM; x̄ = 0.30). The levels of orange color intensity were the best predicted by all the calibration models (R² = 0.87 - 0.89). Elastic net linear regression (ENR) and the tree-based methods, extreme gradient boost (XGBoost) and random forest (RF), performed worse than would be expected but could possibly be improved with increased sample size. Lower average R² values were observed for calibration models of ENR (x̄ = 0.26), XGBoost (x̄ = 0.26) and RF (x̄ = 0.22). The overall RMSE in calibration models was lower in PCR models (x̄ = 0.82) than in L-SVM (x̄ = 0.86) and PLS (x̄ = 0.90). ENR, XGBoost and RF also had higher RMSE (x̄ = 0.90 - 0.92). Effective wavelength selection using interval partial least-squares regression (iPLS) improved the performance of the models, but the resulting models did not perform as well as PLS. Standard normal variate (SNV) pretreatment was useful in improving model performance.
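As a rough illustration of two ideas named in the abstract, the sketch below shows SNV pretreatment of a single spectrum and the two metrics used to compare models (R² and RMSE). It is not the authors' code: the function names are hypothetical, the spectra are made-up numbers, and R² is assumed here to be the coefficient of determination, which the abstract does not specify.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    by its own mean and standard deviation."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(x - mu) / sd for x in spectrum]


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R2)."""
    mean_obs = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted scores."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


# Illustrative data only: the second spectrum is the first one with an
# additive offset and a multiplicative scale, as scatter effects might cause.
raw_a = [1.0, 2.0, 3.0, 4.0]
raw_b = [2.0 * x + 10.0 for x in raw_a]
snv_a, snv_b = snv(raw_a), snv(raw_b)

# Hypothetical sensory scores and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

After SNV the two example spectra coincide up to floating-point error, which is the sense in which the pretreatment removes per-sample offset and scale differences before a calibration model is fitted.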
Pages: 8