Estimate of time series similarity based on models

Cited by: 2
Authors
Knignitskaya T.V. [1 ]
Affiliations
[1] Yuriy Fedkovych Chernovtsy National University, Chernovtsy
Keywords
Cluster; Clustering; Distance between time series by models; DTW; ERP; Time series; Time series model;
DOI
10.1615/JAutomatInfScien.v51.i8.60
Abstract
Defining a distance measure between time series is the starting point for many data mining tasks, such as clustering and classification. Clustering is a principal method of unsupervised learning; it divides data into groups based on internal, a priori unknown characteristics of the data. Dividing data into clusters requires choosing a similarity metric between objects. The paper describes the main existing algorithms for computing the "distance" between time series; these handle the problem well for short time series without outliers. Outliers, which are inherent in real processes, lead to incorrect clustering and, consequently, to wrong decisions. The paper proposes defining the distance between time series as the distance between the (ARIMA) models of those series. In the presence of many outliers, the distances given by classical methods grow linearly, whereas the proposed model-based distance behaves like a logarithmic function. It is shown that as the number of measurements increases, the relative errors of all the classical methods remain almost unchanged, while the relative error of the model-based distance estimate is much smaller and decreases as the number of measurements grows. The main contribution of the article is the definition of a distance between time series based on the concept of a model, and the comparison of this distance with the most commonly used classical methods. Using the Monte Carlo method, it is shown that the proposed distance is more resistant to outliers and gives more accurate results for time series with a large number of observations. In addition, the computational complexity of the model-based distance algorithm is lower than that of the existing algorithms (DTW, ERP, Euclidean distance).
The use of models is undoubtedly one of the most convenient tools for studying the similarity of processes. In addition, for analysis based on this algorithm, it is convenient to use averaged evolutions and limiting evolutions in the diffusion approximation scheme. Also, because limiting evolutions are resistant to outliers, the introduced distance can be used in clustering to build more noise-resistant clusters. © 2019 by Begell House Inc.
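As an illustration of the idea summarized in the abstract, the following minimal sketch compares two classical distances (Euclidean and DTW) with a simple model-based distance. It rests on several assumptions that are not taken from the paper: the model is restricted to a pure AR(p) process fitted by the Yule-Walker equations rather than a full ARIMA model, the model distance is taken to be the Euclidean distance between fitted coefficient vectors, and all function names (`ar_coefficients`, `model_distance`, `dtw_distance`, `simulate_ar1`) are hypothetical. The paper's actual definition may differ.

```python
import math
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariances r[0..max_lag] of a sequence."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def ar_coefficients(x, order):
    """Fit AR(order) via Yule-Walker, solved by Levinson-Durbin recursion.

    Assumes the series is not constant (r[0] > 0).
    """
    r = autocovariance(x, order)
    phi = []       # coefficients of the current AR(k) fit
    error = r[0]   # innovation variance of the current fit
    for k in range(1, order + 1):
        # Reflection coefficient that extends AR(k-1) to AR(k).
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / error
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        error *= 1.0 - kappa * kappa
    return phi


def model_distance(a, b, order=2):
    """Assumed model distance: Euclidean distance between AR coefficients."""
    pa, pb = ar_coefficients(a, order), ar_coefficients(b, order)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(pa, pb)))


def euclidean_distance(a, b):
    """Pointwise Euclidean distance between equal-length series."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping, absolute-value cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def simulate_ar1(phi, n, seed):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal noise."""
    rng = random.Random(seed)
    x, value = [], 0.0
    for _ in range(n):
        value = phi * value + rng.gauss(0.0, 1.0)
        x.append(value)
    return x


if __name__ == "__main__":
    a = simulate_ar1(0.6, 300, seed=1)
    b = simulate_ar1(0.6, 300, seed=2)
    b_outliers = list(b)
    for idx in (50, 150, 250):   # inject a few artificial outliers
        b_outliers[idx] += 10.0
    for name, series in (("clean", b), ("with outliers", b_outliers)):
        print(name,
              "euclidean=%.3f" % euclidean_distance(a, series),
              "dtw=%.3f" % dtw_distance(a, series),
              "model=%.3f" % model_distance(a, series))
```

Running the script prints the three distances for a clean pair and for a pair with injected outliers, which makes it easy to experiment with how each measure responds. It is not a reproduction of the paper's Monte Carlo study, and the Yule-Walker fit used here is itself sensitive to very large outliers.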
Pages: 70-80
Number of pages: 10
Related papers
50 in total
  • [1] Evaluating time series similarity using concept-based models
    Jastrzebska, Agnieszka
    Napoles, Gonzalo
    Salgueiro, Yamisleydi
    Vanhoof, Koen
    KNOWLEDGE-BASED SYSTEMS, 2022, 238
  • [2] Cluster-Based Similarity Search in Time Series
    Karamitopoulos, Leonidas
    Evangelidis, Georgios
    PROCEEDINGS OF THE 2009 FOURTH BALKAN CONFERENCE IN INFORMATICS, 2009, : 113 - 118
  • [3] AN APPROACH FOR TIME SERIES SIMILARITY SEARCH BASED ON LUCENE
    Chang, Min
    Lou, Yuansheng
    Qiu, Lei
    PROCEEDINGS OF 2016 4TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (IEEE CCIS 2016), 2016, : 210 - 214
  • [4] An efficient similarity searching algorithm based on clustering for time series
    Feng, Yucai
    Jiang, Tao
    Zhou, Yingbiao
    Li, Junkui
    ADVANCES IN DATA MINING, PROCEEDINGS: MEDICAL APPLICATIONS, E-COMMERCE, MARKETING, AND THEORETICAL ASPECTS, 2008, 5077 : 360 - 373
  • [5] An improved similarity comparison method for long time series
    Yin, Hong
    Yang, Shuqiang
    Yin, Ping
    Jin, Songchang
    Chen, Zhikun
    MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562 : 3462 - +
  • [6] A Similarity Model Based On Trend For Time Series
    Chen, ShuaiFei
    Lv, Xin
    Yu, Lin
    Mao, YingChi
    Wang, LongBao
    Ma, HongXu
    14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015), 2015, : 435 - 438
  • [7] A New Time Series Similarity Measurement Method Based on Fluctuation Features
    Chen, Hailan
    Gao, Xuedong
    TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2020, 27 (04): : 1134 - 1141
  • [8] A method for measuring similarity of time series based on series decomposition and dynamic time warping
    Zhang, Qingzhen
    Zhang, Chaoqi
    Cui, Langfu
    Han, Xiaoxuan
    Jin, Yang
    Xiang, Gang
    Shi, Yan
    APPLIED INTELLIGENCE, 2023, 53 (06) : 6448 - 6463
  • [9] A method for measuring similarity of time series based on series decomposition and dynamic time warping
    Qingzhen Zhang
    Chaoqi Zhang
    Langfu Cui
    Xiaoxuan Han
    Yang Jin
    Gang Xiang
    Yan Shi
    Applied Intelligence, 2023, 53 : 6448 - 6463
  • [10] An energy-based similarity measure for time series
    Boudraa, Abdel-Ouahab
    Cexus, Jean-Christophe
    Groussat, Mathieu
    Brunagel, Pierre
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2008, 2008 (1)