Quantifying input data drift in medical machine learning models by detecting change-points in time-series data

Authors
Prathapan, Smriti [1]
Samala, Ravi K. [1]
Hadjiyski, Nathan [1]
D'Haese, Pierre-Francois [2]
Maldonado, Fabien [3]
Nguyen, Phuong [4]
Yesha, Yelena [4, 5]
Sahiner, Berkman [1]
Affiliations
[1] US FDA, Ctr Devices & Radiol Hlth, Off Sci & Engn Labs, Silver Spring, MD 20993 USA
[2] West Virginia Univ, Rockefeller Neurosci Inst, Morgantown, WV 26506 USA
[3] Vanderbilt Univ, Med Ctr, Nashville, TN USA
[4] Univ Miami, Dept Comp Sci, Coral Gables, FL 33124 USA
[5] Univ Miami, Dept Radiol, Coral Gables, FL 33124 USA
Source
COMPUTER-AIDED DIAGNOSIS, MEDICAL IMAGING 2024 | 2024 / Vol. 12927
Keywords
Medical Imaging; Mammography; Drift Detection; CUSUM; Clinical AI workflow; Average Run Length; Quality assurance
DOI
10.1117/12.3008771
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Devices enabled by artificial intelligence (AI) and machine learning (ML) are being introduced for clinical use at an accelerating pace. In a dynamic clinical environment, these devices may encounter conditions different from those they were developed for. The statistical mismatch between the training/initial testing data and the production data is often referred to as data drift. Detecting and quantifying data drift is important for ensuring that an AI model performs as expected in clinical environments. A drift detector signals when corrective action is needed because the performance has changed. In this study, we investigate how a change in the performance of an AI model due to data drift can be detected and quantified using a cumulative sum (CUSUM) control chart. To study the properties of CUSUM, we first simulate different scenarios that change the performance of an AI model. We simulate a sudden change in the mean of the performance metric at a change-point (change day) in time. The task is to detect the change quickly while raising few false alarms before the change-point, which may be caused by the statistical variation of the performance metric over time. Subsequently, we simulate data drift by denoising images from the Emory Breast Imaging Dataset (EMBED) after a pre-defined change-point. We detect the change-point by studying the pre- and post-change specificity of a mammographic CAD algorithm. Our results indicate that, with an appropriate choice of parameters, CUSUM can quickly detect relatively small drifts with a small number of false-positive alarms.
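The simulation described in the abstract (a sudden shift in the mean of a daily performance metric, monitored with a CUSUM chart) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a textbook two-sided tabular CUSUM on standardized values, a Gaussian daily metric, and hypothetical settings for the reference value `k`, the decision threshold `h`, and the pre- and post-change means. The function names `simulate_metric` and `cusum_alarm` are made up for illustration.

```python
import random


def simulate_metric(n_days, change_day, mu0, mu1, sigma, seed=0):
    """Simulate a daily performance metric (e.g. specificity) whose mean
    jumps from mu0 to mu1 at change_day. Gaussian noise is an assumption."""
    rng = random.Random(seed)
    return [
        rng.gauss(mu0 if day < change_day else mu1, sigma)
        for day in range(n_days)
    ]


def cusum_alarm(values, mu0, sigma0, k=0.5, h=4.0):
    """Two-sided tabular CUSUM on standardized observations.

    mu0, sigma0: in-control mean and standard deviation (assumed known).
    k: reference value (allowance), in standard-deviation units.
    h: decision threshold, in standard-deviation units.
    Returns the 0-based index of the first alarm, or None if no alarm.
    """
    s_hi = 0.0  # accumulates evidence of an upward shift
    s_lo = 0.0  # accumulates evidence of a downward shift
    for t, x in enumerate(values):
        z = (x - mu0) / sigma0
        s_hi = max(0.0, s_hi + z - k)
        s_lo = max(0.0, s_lo - z - k)
        if s_hi > h or s_lo > h:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical scenario: specificity drops from 0.90 to 0.88 on day 100.
    series = simulate_metric(
        n_days=200, change_day=100, mu0=0.90, mu1=0.88, sigma=0.02
    )
    alarm_day = cusum_alarm(series, mu0=0.90, sigma0=0.02)
    if alarm_day is None:
        print("no alarm raised")
    elif alarm_day < 100:
        print(f"false alarm on day {alarm_day}")
    else:
        print(f"change detected on day {alarm_day}, delay {alarm_day - 100} days")
```

Raising `h` generally lowers the false-alarm rate before the change-point at the cost of a longer detection delay after it, which is the trade-off the abstract describes. Repeating the simulation over many seeds would give empirical estimates of the average run length under each setting.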
Pages: 10