tsflex: Flexible time series processing & feature extraction

Cited by: 14
Authors
Van der Donckt, Jonas [1]
Van der Donckt, Jeroen [1]
Deprost, Emiel [1]
Van Hoecke, Sofie [1]
Affiliations
[1] Univ Ghent, IMEC, IDLab, Technol Pk Zwijnaarde 126, B-9052 Zwijnaarde, Belgium
Funding
Research Foundation Flanders (FWO), Belgium;
Keywords
Time series; Processing; Feature extraction; Machine learning; Python;
DOI
10.1016/j.softx.2021.100971
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Time series processing and feature extraction are crucial and time-intensive steps in conventional machine learning pipelines. Existing packages are limited in their applicability, as they cannot cope with irregularly sampled or asynchronous data and make strong assumptions about the data format. Moreover, these packages do not focus on execution speed and memory efficiency, resulting in considerable overhead. We present tsflex, a Python toolkit for time series processing and feature extraction that focuses on performance and flexibility, enabling broad applicability. This toolkit leverages window-stride arguments of the same data type as the sequence-index, and maintains the sequence-index through all operations. tsflex is flexible as it (1) supports multivariate time series, (2) supports multiple window-stride configurations, and (3) integrates with processing and feature functions from other packages, while (4) making no assumptions about the data sampling regularity, series alignment, and data type. Other functionalities include multiprocessing, detailed execution logging, chunking sequences, and serialization. Benchmarks show that tsflex is faster and more memory-efficient than similar packages, while being more permissive and flexible in its utilization. © 2022 The Author(s). Published by Elsevier B.V.
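Illustrative note: the window-stride idea described in the abstract can be sketched in plain Python. This is a minimal standard-library sketch and not tsflex's API or implementation; the function name `strided_window_features`, its parameters, and the sample data are all hypothetical. It assumes time-based windows and strides (`timedelta`, matching the `datetime` index), half-open windows `[start, start + window)`, and that each output row is labelled with the window end time so the time index is preserved.

```python
from datetime import datetime, timedelta
from statistics import mean


def strided_window_features(samples, window, stride, funcs):
    """Apply feature functions over time-based strided windows.

    samples: list of (timestamp, value) pairs, sorted by timestamp;
             spacing between timestamps may be irregular.
    window, stride: timedelta objects (same "type" as the time index).
    funcs: dict mapping a feature name to a function over a list of values.
    Returns a list of (window_end, {feature_name: value}) rows.
    """
    if not samples:
        return []
    rows = []
    start = samples[0][0]
    last = samples[-1][0]
    # Only evaluate windows that fit within the observed time range.
    while start + window <= last:
        end = start + window
        values = [v for t, v in samples if start <= t < end]
        if values:  # windows without any samples are skipped
            rows.append((end, {name: f(values) for name, f in funcs.items()}))
        start += stride
    return rows


# Hypothetical, irregularly sampled series (offsets in seconds from t0).
t0 = datetime(2021, 1, 1)
offsets = [0, 1, 4, 5, 9, 10, 12]
samples = [(t0 + timedelta(seconds=s), float(i + 1)) for i, s in enumerate(offsets)]

# 5-second windows with a 2.5-second stride (50% overlap).
rows = strided_window_features(
    samples,
    window=timedelta(seconds=5),
    stride=timedelta(seconds=2.5),
    funcs={"mean": mean, "max": max},
)
for end, feats in rows:
    print(end.isoformat(), feats)
```

Because the window and stride are expressed in time units rather than sample counts, the same call works whether or not the samples are evenly spaced, which is the property the abstract emphasizes.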
Pages: 6
Related papers
21 in total
[1]  
[Anonymous], 2021, KATS ONE STOP SHOP
[2]   TSFEL: Time Series Feature Extraction Library [J].
Barandas, Marilia ;
Folgado, Duarte ;
Fernandes, Leticia ;
Santos, Sara ;
Abreu, Mariana ;
Bota, Patricia ;
Liu, Hui ;
Schultz, Tanja ;
Gamboa, Hugo .
SOFTWAREX, 2020, 11
[3]  
Brouwer MD, 2021, BMC MED INFORM DECIS
[4]  
Buitinck L., 2013, ARXIV13090238
[5]  
Burns DM, 2018, J MACH LEARN RES, V19
[6]  
Christ M., 2020, AWESOME TIME SERIES
[7]   Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh - A Python package) [J].
Christ, Maximilian ;
Braun, Nils ;
Neuffer, Julius ;
Kempa-Liehr, Andreas W. .
NEUROCOMPUTING, 2018, 307 :72-77
[8]   Anomaly Detection for IoT Time-Series Data: A Survey [J].
Cook, Andrew A. ;
Misirli, Goksel ;
Fan, Zhong .
IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (07) :6481-6494
[9]   A Few Useful Things to Know About Machine Learning [J].
Domingos, Pedro .
COMMUNICATIONS OF THE ACM, 2012, 55 (10) :78-87
[10]  
Gao T, 2020, VIZTRACER LOW OVERHE