Simulation Study on How Input Data Affects Time-Series Classification Model Results

Cited by: 0
Authors
Sadowska, Maria [1]
Gajowniczek, Krzysztof [1]
Affiliations
[1] Warsaw Univ Life Sci SGGW, Inst Informat Technol, PL-02787 Warsaw, Poland
Keywords
time series; classification; synthetic data
DOI
10.3390/e27060624
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
This paper discusses the results of a study investigating how input data characteristics affect the performance of time-series classification models. In this experiment, we used 82 synthetically generated time-series datasets, created based on predefined functions with added noise. These datasets varied in structure, including differences in the number of classes and noise levels, while maintaining a consistent length and total number of observations. This design allowed us to systematically assess the influence of dataset characteristics on classification outcomes. Seven classification models were evaluated and their performance was compared using accuracy metrics, training time and memory requirements. According to the evaluation, the CNN Classifier achieved the best results, demonstrating the highest robustness to an increasing number of classes and noise. In contrast, the least effective model was the Catch22 Classifier. Overall, the performed research leads to the conclusion that as the number of classes and the level of noise in the data increase, all classification models become less effective, achieving lower accuracy metrics.
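The data-generation idea described in the abstract (predefined functions plus added noise, with a fixed series length and a fixed total number of observations) can be sketched roughly as follows. This is only an illustrative sketch: the function name `make_synthetic_dataset`, the sine-wave class templates, the Gaussian noise model, and all default parameter values are assumptions made for this example, since the abstract does not state which functions or noise distributions the authors used.

```python
import numpy as np


def make_synthetic_dataset(n_classes=3, n_total=300, length=100,
                           noise_std=0.5, seed=0):
    """Return (X, y): noisy series of shape (n_total, length) and int labels.

    Assumption: class k is generated from sin(2*pi*(k+1)*t). The study's
    actual generating functions are not given in the abstract.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, length)

    # Spread the fixed observation budget over the classes as evenly as
    # possible, so n_total stays constant when n_classes changes.
    y = np.arange(n_total) % n_classes

    # One clean template per class, shape (n_classes, length).
    freqs = np.arange(n_classes)[:, None] + 1
    templates = np.sin(2.0 * np.pi * freqs * t[None, :])

    # Additive Gaussian noise (an assumed noise model); noise_std sets the level.
    X = templates[y] + rng.normal(0.0, noise_std, size=(n_total, length))
    return X, y


if __name__ == "__main__":
    # Vary the number of classes and the noise level while keeping the
    # series length and total number of observations fixed.
    for k in (2, 5):
        for sigma in (0.1, 1.0):
            X, y = make_synthetic_dataset(n_classes=k, noise_std=sigma)
            print(k, sigma, X.shape, np.bincount(y))
```

Varying only `n_classes` and `noise_std` while holding `n_total` and `length` constant mirrors the design stated in the abstract, where the effect of each dataset characteristic is isolated from dataset size.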
Pages: 20