Shortcomings in the Evaluation of Blood Glucose Forecasting

Citations: 0
Authors
Lee, Jung Min [1]
Pop-Busui, Rodica [2, 3]
Lee, Joyce M. [4]
Fleischer, Jesper [5, 6]
Wiens, Jenna [7]
Affiliations
[1] Univ Michigan, Div Comp Sci & Engn, Ann Arbor, MI USA
[2] Univ Michigan, Dept Internal Med, Div Metab Endocrinol & Diabet, Ann Arbor, MI USA
[3] Oregon Hlth & Sci Univ, Harold Schnitzer Diabet Ctr, Div Endocrinol Diabet & Clin Nutr, Portland, OR USA
[4] Univ Michigan, Susan B Meister Child Hlth Evaluat & Res Ctr, Div Pediat Endocrinol, Ann Arbor, MI USA
[5] Steno Diabet Ctr Aarhus, Aarhus, Denmark
[6] Steno Diabet Ctr Zealand, Herlev, Denmark
[7] Univ Michigan, Div Comp Sci & Engn, Ann Arbor, MI 48109 USA
Keywords
Predictive models; Insulin; Forecasting; Glucose; Training; Long short term memory; Prediction algorithms; Artificial intelligence; artificial pancreas; blood glucose forecasting; closed-loop systems; forecasting evaluation; machine learning
DOI
10.1109/TBME.2024.3424665
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Objective: Recent years have seen an increase in machine learning (ML)-based blood glucose (BG) forecasting models, with a growing emphasis on their potential application to hybrid or closed-loop predictive glucose controllers. However, current approaches evaluate the accuracy of these models on benchmark data generated under the behavior policy, which may differ significantly from the data a model encounters in a control setting. This study challenges the efficacy of such evaluation approaches, demonstrating that they can fail to capture an ML-based model's true performance in closed-loop control settings.
Methods: Forecast error measured using current evaluation approaches was compared to the control performance of two forecasters, a machine learning-based model (LSTM) and a rule-based model (Loop), when each was used in silico with a model-based controller in a hybrid closed-loop setting.
Results: Under current evaluation standards, LSTM achieves a significantly lower (better) forecast error than Loop, with a root mean squared error (RMSE) of 11.57 ± 0.05 mg/dL vs. 18.46 ± 0.07 mg/dL at the 30-minute prediction horizon. Yet in a control setting, LSTM led to significantly worse control performance, with only 77.14% (IQR 66.57-84.03) time-in-range compared to 86.20% (IQR 78.28-91.21) for Loop.
Conclusion: Prevailing evaluation methods can fail to accurately capture a forecaster's performance when it is used in closed-loop settings.
Significance: Our findings underscore the limitations of current evaluation standards and the need for alternative evaluation metrics and training strategies when developing BG forecasters for closed-loop control systems.
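The two quantities contrasted in the abstract, forecast RMSE and time-in-range, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `rmse` and `time_in_range` are hypothetical, the 70-180 mg/dL target range is an assumption (the abstract does not state the range it uses), and the sample values are invented purely for illustration.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between forecasts and observed glucose (mg/dL)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def time_in_range(glucose, low=70.0, high=180.0):
    """Percentage of readings within [low, high] mg/dL.

    The 70-180 mg/dL default is an assumed target range, not taken
    from the abstract.
    """
    if not glucose:
        raise ValueError("glucose trace must be non-empty")
    in_range = sum(1 for g in glucose if low <= g <= high)
    return 100.0 * in_range / len(glucose)

# Invented values for illustration only; not data from the study.
observed = [110.0, 150.0, 190.0, 65.0]
forecast = [112.0, 146.0, 190.0, 69.0]

print(rmse(forecast, observed))   # forecast error against the observed trace
print(time_in_range(observed))    # share of readings inside the target range
```

RMSE scores a forecaster against a fixed, previously recorded trace, whereas time-in-range is computed on the glucose trace that results once the forecaster's predictions drive insulin dosing. That difference is why, as the abstract reports, a lower RMSE need not translate into better control.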
Pages: 3424-3431
Page count: 8