A comparison of 4 different machine learning algorithms to predict lactoferrin content in bovine milk from mid-infrared spectra

Cited by: 36
Authors
Soyeurt, H. [1]
Grelet, C. [2]
McParland, S. [3]
Calmels, M. [4]
Coffey, M. [5]
Tedde, A. [1]
Delhez, P. [1, 6]
Dehareng, F. [2]
Gengler, N. [1]
Affiliations
[1] Univ Liege, TERRA Res & Teaching Ctr, Gembloux Agrobio Tech, Gembloux, Belgium
[2] Walloon Res Ctr, Valorisat Agr Prod, Gembloux, Belgium
[3] TEAGASC, Anim & Grassland Res & Innovat Ctr, Moorepk, Fermoy, Cork, Ireland
[4] Seenovia, Res & Dev, St Berthevin, France
[5] Scotlands Rural Coll, Livestock Breeding Anim & Vet Sci, Edinburgh, Midlothian, Scotland
[6] Natl Fund Sci Res, Brussels, Belgium
Funding
Science Foundation Ireland; Biotechnology and Biological Sciences Research Council (UK)
Keywords
milk; lactoferrin; mid infrared; machine learning; BETA-HYDROXYBUTYRATE; INFRARED-SPECTRUM; DAIRY-COWS; STAGE; REGRESSION; MASTITIS; BREEDS;
DOI
10.3168/jds.2020-18870
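Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905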
Abstract
Lactoferrin (LF) is a glycoprotein naturally present in milk. Its content varies throughout lactation, but also with mastitis; therefore, it is a potential additional indicator of udder health beyond somatic cell count. Consequently, there is an interest in quantifying this biomolecule routinely. The first prediction equations proposed in the literature to predict LF content in milk from milk mid-infrared spectrometry were built using partial least squares regression (PLSR) because of the limited size of the data sets. The current study aimed to test 4 different machine learning algorithms using a large data set comprising 6,619 records collected across different herds, breeds, and countries. The first algorithm was a PLSR, as used in past investigations. The second and third algorithms used partial least squares (PLS) factors combined with a linear and a polynomial support vector regression (PLS + SVR), respectively. The fourth algorithm also used PLS factors, but as inputs to an artificial neural network with 1 hidden layer (PLS + ANN). The training and validation sets comprised 5,541 and 836 records, respectively. Although PLS + polynomial SVR had the best calibration performance, its validation performance was the worst. The 3 other algorithms had similar validation performances: the validation root mean squared error (RMSE) ranged between 162.17 and 166.75 mg/L of milk. However, the lower standard deviation of the cross-validation RMSE and the better normality of the residual distribution observed for PLS + ANN suggest that this model was more suitable for predicting the LF content in milk from milk mid-infrared spectra (validation R² = 0.60 and validation RMSE = 162.17 mg/L of milk). This PLS + ANN model was then applied to almost 6 million spectral records. The predicted LF showed the expected relationships with milk yield, somatic cell score, somatic cell count, and stage of lactation.
The model tended to underestimate high LF values (higher than 600 mg/L of milk). However, if the prediction threshold was set to 500 mg/L, 82% of the validation samples with an LF content higher than 600 mg/L were detected. Future research should aim to increase the number of those extremely high LF records in the calibration set.
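The two validation figures quoted in the abstract, the RMSE and the share of high-LF samples detected at a lowered prediction threshold, can be illustrated with a minimal sketch. This is not the authors' code: the function names `rmse` and `detection_rate` are hypothetical, and the six observed and predicted values are made up purely for illustration. Only the 600 mg/L cutoff and the 500 mg/L threshold come from the abstract.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def detection_rate(observed, predicted, high_cutoff=600.0, prediction_threshold=500.0):
    """Share of truly high samples (observed > high_cutoff) whose
    prediction exceeds prediction_threshold."""
    high_predictions = [p for o, p in zip(observed, predicted) if o > high_cutoff]
    if not high_predictions:
        raise ValueError("no observed value exceeds high_cutoff")
    detected = sum(p > prediction_threshold for p in high_predictions)
    return detected / len(high_predictions)


# Made-up LF contents in mg/L of milk (illustration only, not study data).
# The high observed values are deliberately under-predicted.
observed = [120.0, 250.0, 640.0, 700.0, 900.0, 610.0]
predicted = [150.0, 230.0, 560.0, 620.0, 710.0, 480.0]

print("RMSE (mg/L):", round(rmse(observed, predicted), 2))
print("Detected at 500 mg/L threshold:", detection_rate(observed, predicted))
print("Detected at 600 mg/L threshold:",
      detection_rate(observed, predicted, prediction_threshold=600.0))
```

Because a model that underestimates high values pushes true high-LF samples below the 600 mg/L cutoff, flagging at a lower prediction threshold recovers more of them, at the cost of flagging more samples whose true content is below the cutoff.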
Pages: 11585-11596
Number of pages: 12