End-to-end deep representation learning for time series clustering: a comparative study

Authors
Baptiste Lafabregue
Jonathan Weber
Pierre Gançarski
Germain Forestier
Affiliations
[1] Université de Haute Alsace, IRIMAS
[2] Université de Strasbourg, ICube
[3] Monash University, Faculty of IT
Source
Data Mining and Knowledge Discovery | 2022, Vol. 36
Keywords
Clustering; Deep learning; Time series
Abstract
Time series are ubiquitous in data mining applications. As with other types of data, annotations can be challenging to acquire, thus preventing the training of time series classification models. In this context, clustering methods can be an appropriate alternative as they create homogeneous groups allowing a better analysis of the data structure. Time series clustering has been investigated for many years and multiple approaches have already been proposed. Following the advent of deep learning in computer vision, researchers recently started to study the use of deep clustering to cluster time series data. The existing approaches mostly rely on representation learning (imported from computer vision), which consists of learning a representation of the data and performing the clustering task using this new representation. The goal of this paper is to provide a careful study and an experimental comparison of the existing literature on time series representation learning for deep clustering. In this paper, we went beyond the sole comparison of existing approaches and proposed to decompose deep clustering methods into three main components: (1) network architecture, (2) pretext loss, and (3) clustering loss. We evaluated all combinations of these components (totaling 300 different models) with the objective of studying their relative influence on the clustering performance. We also experimentally compared the most efficient combinations we identified with existing non-deep clustering methods. Experiments were performed using the largest repository of time series datasets (the UCR/UEA archive), composed of 128 univariate and 30 multivariate datasets. Finally, we proposed an extension of the class activation maps method to the unsupervised case, which makes it possible to identify patterns that shed light on how the network clustered the time series.
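The two-stage pattern the abstract describes (learn a representation with a pretext loss, then cluster in that representation space) can be illustrated with a minimal, purely illustrative sketch. Here a tiny linear autoencoder with tied weights stands in for the deep architectures the paper compares, reconstruction MSE plays the role of the pretext loss, and a hand-rolled k-means performs the clustering step; the toy sine-wave data and all hyperparameters are assumptions for the example and do not reproduce any of the paper's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 40 univariate time series of length 64, drawn from two
# underlying groups (sine waves of different frequencies, plus noise).
t = np.linspace(0, 2 * np.pi, 64)
X = np.vstack([np.sin(f * t) + 0.1 * rng.standard_normal(64)
               for f in [1.0] * 20 + [3.0] * 20])

# (1) Representation learning with a pretext loss: a linear autoencoder
# with tied weights W, trained by gradient descent on reconstruction MSE.
d, k = X.shape[1], 2                       # series length, latent size
W = 0.01 * rng.standard_normal((d, k))
lr = 0.005
for _ in range(2000):
    Z = X @ W                              # encode
    err = Z @ W.T - X                      # reconstruction error
    grad = (X.T @ err @ W + err.T @ X @ W) / len(X)  # d(MSE)/dW, tied weights
    W -= lr * grad

Z = X @ W                                  # learned representation

# (2) Clustering step: plain k-means in the latent space, seeded
# deterministically with one series from each end of the dataset.
centers = Z[[0, -1]]
for _ in range(20):
    assign = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([Z[assign == c].mean(0) for c in range(2)])
```

On this toy data the latent codes of the two frequency groups are well separated, so k-means recovers the group structure; the paper's contribution is precisely to measure how far this intuition carries over to deep architectures and real pretext/clustering losses on the UCR/UEA datasets.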
Pages: 29–81 (52 pages)