The Distance Between: An Algorithmic Approach to Comparing Stochastic Models to Time-Series Data

Authors
Sherlock, Brock D. [1, 2]
Boon, Marko A. A. [2]
Vlasiou, Maria [3]
Coster, Adelle C. F. [1]
Affiliations
[1] Univ New South Wales, Sch Math & Stat, Sydney, NSW 2052, Australia
[2] Eindhoven Univ Technol, Dept Math & Comp Sci, POB 513, NL-5600 MB Eindhoven, Netherlands
[3] Univ Twente, Fac Elect Engn Math & Comp Sci, POB 217, NL-7500 AE Enschede, Netherlands
Funding
Australian Research Council
Keywords
Distance metrics; Time-series data; Multiple experiments; Distance between evolving distributions; APPROXIMATE BAYESIAN COMPUTATION; KOLMOGOROV-SMIRNOV; RECYCLING PATHWAY; GLUT4; INSULIN; PROTEIN; DISTRIBUTIONS; COMPARTMENTS; TRAFFICKING; SIMULATION;
DOI
10.1007/s11538-024-01331-y
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
While mean-field models of cellular operations have identified dominant processes at the macroscopic scale, stochastic models may provide further insight into mechanisms at the molecular scale. To identify plausible stochastic models, quantitative comparisons between the models and the experimental data are required. The data for these systems have small sample sizes and time-evolving distributions. The aim of this study is to identify appropriate distance metrics for the quantitative comparison of stochastic model outputs and time-evolving stochastic measurements of a system. We identify distance metrics with features suitable for driving parameter inference, model comparison, and model validation, constrained by data from multiple experimental protocols. In this study, stochastic model outputs are compared to synthetic data across three scales: that of the data at the points the system is sampled during the time course of each type of experiment; a combined distance across the time course of each experiment; and a combined distance across all the experiments. Two broad categories of comparators at each point were considered, based on the empirical cumulative distribution function (ECDF) of the data and of the model outputs: discrete-based measures such as the Kolmogorov-Smirnov distance, and integrated measures such as the Wasserstein-1 distance between the ECDFs. It was found that the discrete-based measures were highly sensitive to parameter changes near the synthetic data parameters, but were largely insensitive otherwise, whereas the integrated distances had smoother transitions as the parameters approached the true values. The integrated measures were also found to be robust to noise added to the synthetic data, replicating experimental error. The characteristics of the identified distances provide the basis for the design of an algorithm suitable for fitting stochastic models to real-world stochastic data.
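To make the two categories of comparator concrete, the following is a minimal sketch, not the authors' implementation, of the Kolmogorov-Smirnov distance and the Wasserstein-1 distance computed from the ECDFs of two samples. The function names (`ecdf_at`, `ks_distance`, `wasserstein1_distance`, `combined_distance`) are hypothetical, and the plain mean used to combine distances across time points is an assumption for illustration only; the abstract does not specify how the combined distances are formed.

```python
from bisect import bisect_right


def ecdf_at(sorted_sample, x):
    """Evaluate the ECDF of a sorted sample at x (fraction of values <= x)."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_distance(a, b):
    """Kolmogorov-Smirnov distance: largest vertical gap between two ECDFs."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))  # ECDFs only change at observed values
    return max(abs(ecdf_at(a, x) - ecdf_at(b, x)) for x in grid)


def wasserstein1_distance(a, b):
    """Wasserstein-1 distance: area between the two ECDFs."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    # Both ECDFs are constant on [left, right), so the area is a sum of rectangles.
    for left, right in zip(grid, grid[1:]):
        total += abs(ecdf_at(a, left) - ecdf_at(b, left)) * (right - left)
    return total


def combined_distance(data_by_time, model_by_time, pointwise):
    """Hypothetical aggregation: unweighted mean of pointwise distances over time."""
    pairs = list(zip(data_by_time, model_by_time))
    return sum(pointwise(d, m) for d, m in pairs) / len(pairs)


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0]
    for shift in (5.0, 10.0):
        model = [x + shift for x in data]
        # Once the samples no longer overlap, KS saturates while W1 keeps growing.
        print(shift, ks_distance(data, model), wasserstein1_distance(data, model))
```

In this toy case the two samples do not overlap for either shift, so the Kolmogorov-Smirnov distance cannot distinguish them, while the Wasserstein-1 distance grows with the shift. This is one simple way to see the contrast the abstract describes between discrete-based and integrated measures far from the true parameters.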
Pages: 41