Efficient Observation Time Window Segmentation for Administrative Data Machine Learning

Citations: 0
Authors
Taib, Musa [1]
Messier, Geoffrey G. [1]
Affiliations
[1] Univ Calgary, Dept Elect & Software Engn, Calgary T2N 1N4, AB, Canada
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Machine learning; Data models; Market research; Vectors; Vehicle dynamics; Tuning; Time series analysis; Recurrent neural networks; Physiology; Null value; Administrative data; time window segmentation; machine learning; hospital; homelessness
DOI
10.1109/ACCESS.2024.3484270
CLC number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning models benefit when allowed to learn from temporal trends in time-stamped administrative data. These trends can be represented by dividing a model's observation window into time segments or bins. Model training time and performance can be improved by representing each feature with a different time resolution. However, this causes the time bin size hyperparameter search space to grow exponentially with the number of features. This paper proposes a computationally efficient time series analysis to investigate binning (TAIB) technique that determines which subset of data features benefit the most from time bin size hyperparameter tuning. This technique is demonstrated using hospital and housing/homelessness administrative data sets. The results show that TAIB leads to models that are not only more efficient to train but can perform better than models that default to representing all features with the same time bin size.
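The binning idea in the abstract can be illustrated with a minimal sketch. It is not the TAIB technique itself, only the representation the abstract describes: each feature's events within the observation window are counted per time bin, and each feature may use its own bin size. The feature names, event days, window length, and bin sizes below are hypothetical values chosen for illustration. The second function shows why an exhaustive search over per-feature bin sizes grows exponentially with the number of features.

```python
def bin_events(event_days, window_days, bin_days):
    """Count events per time bin over an observation window.

    event_days: day offsets (0 <= d < window_days) at which the feature occurred.
    Returns one count per bin; the last bin may cover fewer than bin_days days.
    """
    n_bins = -(-window_days // bin_days)  # ceiling division
    counts = [0] * n_bins
    for d in event_days:
        if 0 <= d < window_days:  # ignore events outside the window
            counts[d // bin_days] += 1
    return counts


def search_space_size(n_features, n_candidate_bin_sizes):
    """Number of per-feature bin size combinations in an exhaustive search."""
    return n_candidate_bin_sizes ** n_features


# Hypothetical example: a 90-day observation window with two features,
# each represented at a different time resolution.
window_days = 90
events = {
    "er_visit": [2, 5, 40, 88],
    "shelter_stay": [0, 1, 2, 30, 31, 60],
}
bin_sizes = {"er_visit": 30, "shelter_stay": 45}

vectors = {
    name: bin_events(days, window_days, bin_sizes[name])
    for name, days in events.items()
}
```

With these assumed inputs, `er_visit` is summarized by three 30-day bins and `shelter_stay` by two 45-day bins. Trying, say, 4 candidate bin sizes independently for each of 10 features would already mean 4 ** 10 combinations, which is the growth that motivates tuning only a subset of features.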
Pages: 158647-158656
Page count: 10