The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances

Cited by: 909
Authors
Bagnall, Anthony [1 ]
Lines, Jason [1 ]
Bostrom, Aaron [1 ]
Large, James [1 ]
Keogh, Eamonn [2 ]
Affiliations
[1] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England
[2] Univ Calif Riverside, Comp Sci & Engn Dept, Riverside, CA 92521 USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Time series classification; Shapelets; Elastic distance measures; Time series similarity; STATISTICAL COMPARISONS; REPRESENTATION; TRANSFORMATION; CLASSIFIERS; SIMILARITY; DISTANCE; FEATURES;
DOI
10.1007/s10618-016-0483-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the last 5 years there have been a large number of new time series classification algorithms proposed in the literature. These algorithms have been evaluated on subsets of the 47 data sets in the University of California, Riverside time series classification archive. The archive has recently been expanded to 85 data sets, over half of which have been donated by researchers at the University of East Anglia. Aspects of previous evaluations have made comparisons between algorithms difficult. For example, several different programming languages have been used, experiments involved a single train/test split and some used normalised data whilst others did not. The relaunch of the archive provides a timely opportunity to thoroughly evaluate algorithms on a larger number of datasets. We have implemented 18 recently proposed algorithms in a common Java framework and compared them against two standard benchmark classifiers (and each other) by performing 100 resampling experiments on each of the 85 datasets. We use these results to test several hypotheses relating to whether the algorithms are significantly more accurate than the benchmarks and each other. Our results indicate that only nine of these algorithms are significantly more accurate than both benchmarks and that one classifier, the collective of transformation ensembles, is significantly more accurate than all of the others. All of our experiments and results are reproducible: we release all of our code, results and experimental details and we hope these experiments form the basis for more robust testing of new algorithms in the future.
Pages: 606-660
Page count: 55
Related papers
49 in total
[1]  
Bagnall A, UCR UEA TSC ARCH
[2]  
Bagnall A, THE UEA TSC CODEBASE
[3]   Time-Series Classification with COTE: The Collective of Transformation-Based Ensembles [J].
Bagnall, Anthony ;
Lines, Jason ;
Hills, Jon ;
Bostrom, Aaron .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (09) :2522-2535
[4]   A Run Length Transformation for Discriminating Between Auto Regressive Time Series [J].
Bagnall, Anthony ;
Janacek, Gareth .
JOURNAL OF CLASSIFICATION, 2014, 31 (02) :154-178
[5]   CID: an efficient complexity-invariant distance for time series [J].
Batista, Gustavo E. A. P. A. ;
Keogh, Eamonn J. ;
Tataw, Oben Moses ;
de Souza, Vinicius M. A. .
DATA MINING AND KNOWLEDGE DISCOVERY, 2014, 28 (03) :634-669
[6]   Time series representation and similarity based on local autopatterns [J].
Baydogan, Mustafa Gokce ;
Runger, George .
DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (02) :476-509
[7]   A Bag-of-Features Framework to Classify Time Series [J].
Baydogan, Mustafa Gokce ;
Runger, George ;
Tuv, Eugene .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (11) :2796-2802
[8]  
Benavoli A, 2016, J MACH LEARN RES, V17
[9]   Binary Shapelet Transform for Multiclass Time Series Classification [J].
Bostrom, Aaron ;
Bagnall, Anthony .
BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, 2015, 9263 :257-269
[10]   Locally adaptive dimensionality reduction for indexing large time series databases [J].
Chakrabarti, K ;
Keogh, E ;
Mehrotra, S ;
Pazzani, M .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 2002, 27 (02) :188-228