Approximate Clustering of Time-Series Datasets using k-Modes Partitioning

Cited by: 0
Authors
Aghabozorgi, Saeed [1]
Teh Ying Wah [1]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
Keywords
data mining; clustering; time series; approximation; distance measure; dimensionality reduction; similarity search; representation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Data in many systems, such as those in finance, healthcare, and business, are stored as time series, and interest in time series mining in these areas has surged accordingly. Clustering is performed as a pre-processing or exploratory step in many data mining tasks. Time series data sets are often so large that they cannot fit in main memory for clustering, and dimensionality reduction is a common solution in this case. However, reduction comes at a relatively high cost: the data overlooked in the process leads to low-quality clustering. In this paper, we propose a new approach that uses discretization to improve the accuracy of approximate clustering of dimensionality-reduced time series. A new distance measure is first introduced. Thereafter, the partitional algorithm that best matches the representation method is proposed.
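The abstract names the ingredients (dimensionality reduction, discretization, a distance measure on the discretized form, and a partitional k-modes step) without giving details. The following is only a minimal sketch of that general pipeline, not the authors' method: the segment averaging, the fixed `breakpoints`, the plain mismatch count used as the distance, and all function names are assumptions made for illustration.

```python
from collections import Counter

def paa(series, segments):
    """Reduce a series to `segments` values by averaging equal-width chunks.
    Assumes len(series) >= segments so that no chunk is empty."""
    n = len(series)
    means = []
    for i in range(segments):
        chunk = series[i * n // segments:(i + 1) * n // segments]
        means.append(sum(chunk) / len(chunk))
    return means

def discretize(series, segments, breakpoints):
    """Turn each segment mean into a symbol index: the number of
    breakpoints it exceeds (hypothetical fixed breakpoints)."""
    return tuple(sum(v > b for b in breakpoints) for v in paa(series, segments))

def mismatch(a, b):
    """Stand-in categorical distance: count of positions that differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(items, initial_modes, max_iter=20):
    """Basic k-modes: assign each item to its nearest mode, then set each
    mode to the most frequent symbol per position among its members."""
    modes = list(initial_modes)
    for _ in range(max_iter):
        labels = [min(range(len(modes)), key=lambda j: mismatch(it, modes[j]))
                  for it in items]
        new_modes = []
        for j, old in enumerate(modes):
            members = [it for it, lab in zip(items, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep the mode of an empty cluster
                continue
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes

# Toy data: two rising and two falling series (made up for illustration).
raw = [
    [0, 0, 1, 1, 5, 5, 6, 6],
    [0, 1, 0, 1, 6, 5, 6, 5],
    [6, 6, 5, 5, 1, 1, 0, 0],
    [5, 6, 5, 6, 0, 1, 1, 0],
]
words = [discretize(s, segments=4, breakpoints=(2, 4)) for s in raw]
labels, modes = k_modes(words, initial_modes=[words[0], words[2]])
print(words)
print(labels, modes)
```

On this toy input the rising series share one symbolic word and the falling series another, so the two clusters separate after a single pass. Real data would need normalization, a principled choice of breakpoints, and a better initialization than hand-picked modes.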
Pages: 207-228
Page count: 22
Related papers
57 in total
  • [1] Aghabozorgi Saeed R., 2011, Proceedings of the 2011 International Conference on Data Mining (DMIN 2011), P214
  • [2] Clustering of large time series datasets
    Aghabozorgi, Saeed
    Teh, Ying Wah
    [J]. INTELLIGENT DATA ANALYSIS, 2014, 18 (05) : 793 - 817
  • [3] Stock market co-movement assessment using a three-phase clustering method
    Aghabozorgi, Saeed
    Teh, Ying Wah
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (04) : 1301 - 1314
  • [4] Aghabozorgi S, 2012, J INF SCI ENG, V28, P671
  • [5] Alon J, 2003, PROC CVPR IEEE, P375
  • [6] A comparison of extrinsic clustering evaluation metrics based on formal constraints
    Amigo, Enrique
    Gonzalo, Julio
    Artiles, Javier
    Verdejo, Felisa
    [J]. INFORMATION RETRIEVAL, 2009, 12 (04) : 461 - 486
  • [7] Andreopoulos B., 2005, WSEAS Transactions on Information Science and Applications, V2, P1625
  • [8] A roadmap of clustering algorithms: finding a match for a biomedical application
    Andreopoulos, Bill
    An, Aijun
    Wang, Xiaogang
    Schroeder, Michael
    [J]. BRIEFINGS IN BIOINFORMATICS, 2009, 10 (03) : 297 - 314
  • [9] [Anonymous], P 22 INT C DAT ENG W
  • [10] [Anonymous], 2003, DMKD, DOI 10.1145/882082.882086