A benchmark study on time series clustering

Cited by: 87
Authors
Javed, Ali [1, 3, 4]
Lee, Byung Suk [1, 3]
Rizzo, Donna M. [2, 3]
Affiliations
[1] Univ Vermont, Dept Comp Sci, Burlington, VT USA
[2] Univ Vermont, Dept Civil & Environm Engn, Burlington, VT USA
[3] Univ Vermont, Gund Inst Environm, Burlington, VT USA
[4] 82 Univ Pl,Innovat Hall, Burlington, VT 05405 USA
Source
MACHINE LEARNING WITH APPLICATIONS | 2020 / Volume 1
Funding
National Science Foundation (USA)
Keywords
Time series; Clustering; Benchmark; UCR archive; CLASSIFICATION; TURBIDITY; DENSITY; FIND;
DOI
10.1016/j.mlwa.2020.100001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents the first time series clustering benchmark utilizing all time series datasets currently available in the University of California Riverside (UCR) archive, the state-of-the-art repository of time series data. Specifically, the benchmark examines eight popular clustering methods representing three categories of clustering algorithms (partitional, hierarchical, and density-based) and three types of distance measures (Euclidean, dynamic time warping, and shape-based), while adhering to six restrictions on datasets and methods to make the comparison as unbiased as possible. A phased evaluation approach was then designed for summarizing dataset-level assessment metrics and discussing the results. The benchmark study presented can be a useful reference for the research community on its own, and the dataset-level assessment metrics reported may be used for designing evaluation frameworks to answer different research questions.
Pages: 13