Overinterpretation of evaluation results in machine learning studies for maize yield prediction: A systematic review
Cited by: 0
Authors: Leukel, Joerg [1]; Scheurer, Luca [1]; Zimpel, Tobias [1]
Affiliations: [1] Univ Hohenheim, Dept Informat Syst & Digital Technol, Schwerzstr 35, D-70599 Stuttgart, Germany
Keywords: Artificial intelligence; Prediction model; Reporting quality; Spin; Yield prediction; ABSOLUTE ERROR MAE; CROP YIELD; MODEL; RMSE
DOI: 10.1016/j.compag.2024.109892
Chinese Library Classification: S [Agricultural Science]
Subject Classification Code: 09
Abstract:
An increasing number of studies develop and evaluate machine learning (ML) models for crop yield prediction. However, poor reporting and the intentional or unintentional misuse of language can distort the interpretation of evaluation results and make them appear more favorable than they actually are. The prevalence of overinterpretation of evaluation results is not yet known. To address this gap, we determined the frequency of overinterpretation in different sections of articles evaluating prediction models for maize as a specific crop. We conducted a prospectively registered systematic review, searching Scopus and Web of Science for journal articles reporting the evaluation of ML prediction models for maize yield, published from 2014 to 2023. We evaluated 207 articles against reporting criteria adapted from overinterpretation frameworks in the biomedical literature. Overly positive words were used to describe evaluation results in 58 conclusion sections (28.0%) and 52 abstracts (25.1%). Fifty-two articles (25.1%) reported results for the coefficient of determination (R2) that were inconsistent between the abstract and the main text. Forty-eight articles (23.2%) incorrectly interpreted correlation coefficients as a measure of performance. Thirty articles (14.5%) reported wrong formulas for at least one of five standard metrics. In 25 articles (12.1%), at least one metric lacked a unit of measurement or had an incorrect one. In 21 articles each (10.1%), the reporting of the normalized root mean square error (NRMSE) was inconsistent between the abstract and the main text, and R2 was wrongly interpreted as correlation. In summary, overinterpretation of evaluation results in ML studies for maize yield prediction was a frequently observed phenomenon. The findings have important implications for both the interpretation of individual studies and the broader accumulation of knowledge. Authors, reviewers, and editorial boards should put more emphasis on complete, unambiguous, and correct reporting of evaluation results, for which we provide recommendations.
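For orientation, the standard evaluation metrics referred to in the abstract are commonly defined as follows. This is a reference sketch based on their textbook definitions, not on formulas reproduced from the reviewed articles; here y_i denotes an observed yield, ŷ_i a predicted yield, ȳ the mean of the observations, and n the number of samples:

\[
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \qquad
\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{y}}
\]
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad
r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}
\]

Note that NRMSE is sometimes normalized by the observed range (y_max − y_min) rather than the mean, and that R2 equals r2 only for a least-squares linear fit with intercept; on held-out data R2 can even be negative. This distinction is why interpreting R2 as a correlation, or a correlation coefficient as a performance measure, overstates predictive performance in the way the review describes.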
Pages: 9