Robust fuzzy clustering based on quantile autocovariances

Cited by: 0
Authors
B. Lafuente-Rego
P. D’Urso
J. A. Vilar
Affiliations
[1] University of A Coruña, Research Group on Modeling, Optimization and Statistical Inference (MODES), Department of Mathematics, Computer Science Faculty
[2] Sapienza University of Rome, Dipartimento di Scienze Sociali ed Economiche
Source
Statistical Papers | 2020 / Volume 61
Keywords
Time series data; Robust fuzzy C-medoids clustering; Quantile autocovariances; Exponential distance; Noise cluster; Trimming
DOI
Not available
Abstract
Robustness to the presence of outliers in time series clustering is addressed. Assuming that the clustering principle is to group realizations of series generated from similar dependence structures, three robust versions of a fuzzy C-medoids model based on comparing sample quantile autocovariances are proposed by considering, respectively, the so-called metric, noise, and trimmed approaches. Each method achieves robustness against outliers in a different manner. The metric approach considers a suitable transformation of the distance aimed at smoothing the effect of the outliers, the noise approach gathers the outliers into a separate artificial cluster, and the trimmed approach removes a fraction of the time series. All the proposed approaches take advantage of the high capability of the quantile autocovariances to discriminate between independent realizations from a broad range of stationary processes, including linear, non-linear and conditional heteroskedastic models. An extensive simulation study involving scenarios with different generating models contaminated with outliers is performed. Robustness is evaluated against (i) outliers generated from different generating patterns, and (ii) outliers characterized by isolated, temporary or persistent level changes. The influence of the input parameters required by the different algorithms is analyzed. Regardless of the considered models, the results show that the proposed robust procedures neutralize the effect of the anomalous series while preserving the true clustering structure, and fairly outperform other robust algorithms based on alternative metrics. Two applications to financial data sets illustrate the usefulness of the proposed models.
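To make the abstract's central quantity concrete, the following Python sketch computes a vector of sample quantile autocovariances for a series and a squared Euclidean distance between two such vectors. It is an illustration under stated assumptions, not the authors' implementation: the estimator used is the commonly cited form, the sample proportion of pairs with X_t ≤ q_τ and X_{t+l} ≤ q_τ' minus ττ'; the function names, the default lags and probability levels, and the exponential transformation 1 − exp(−β d²) standing in for the "metric approach" are all assumptions introduced here.

```python
import numpy as np


def qaf_features(x, lags=(1, 2), probs=(0.1, 0.5, 0.9)):
    """Sample quantile autocovariances stacked into one feature vector.

    Assumed estimator: for lag l and levels (tau, tau'), the fraction of
    pairs with x[t] <= q_tau and x[t+l] <= q_tau', minus tau * tau'.
    Lags must be positive integers smaller than the series length.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    tau = np.asarray(probs, dtype=float)
    q = np.quantile(x, tau)                      # sample quantiles
    ind = (x[:, None] <= q[None, :]).astype(float)  # indicators, shape (n, r)
    feats = []
    for lag in lags:
        # joint[i, j] = mean of I(x[t] <= q_i) * I(x[t+lag] <= q_j)
        joint = ind[:-lag].T @ ind[lag:] / (n - lag)
        feats.append((joint - np.outer(tau, tau)).ravel())
    return np.concatenate(feats)


def qaf_sq_distance(x, y, **kwargs):
    """Squared Euclidean distance between two quantile-autocovariance vectors."""
    diff = qaf_features(x, **kwargs) - qaf_features(y, **kwargs)
    return float(diff @ diff)


def exponential_dissimilarity(d2, beta=1.0):
    """Assumed exponential transformation: bounded in [0, 1), so a very
    distant (outlying) series cannot dominate the clustering objective."""
    return 1.0 - np.exp(-beta * np.asarray(d2, dtype=float))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.standard_normal(500)
    # AR(1) series with coefficient 0.8 (hypothetical example data)
    ar = np.zeros(500)
    for t in range(1, 500):
        ar[t] = 0.8 * ar[t - 1] + rng.standard_normal()
    d2 = qaf_sq_distance(white, ar)
    print(d2, exponential_dissimilarity(d2, beta=5.0))
```

The bounded transformation is one way to read "smoothing the effect of the outliers"; the noise and trimmed approaches described in the abstract change the clustering objective itself and are not sketched here.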
Pages: 2393-2448
Page count: 55