Robust clustering for functional data based on trimming and constraints

Cited by: 15
Authors
Rivera-Garcia, Diego [1 ]
Garcia-Escudero, Luis A. [2 ]
Mayo-Iscar, Agustin [2 ]
Ortega, Joaquin [1 ]
Affiliations
[1] CIMAT, AC Jalisco S-N, Guanajuato 36240, Mexico
[2] Univ Valladolid, Dept Estadist & Invest Operativa, Paseo de Belen 7, E-47005 Valladolid, Spain
Keywords
Functional data analysis; Clustering; Robustness; Trimming; Functional principal components analysis;
DOI
10.1007/s11634-018-0312-7
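Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code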
020208 ; 070103 ; 0714 ;
Abstract
Many clustering algorithms for data consisting of curves or functions have recently been proposed. However, contamination in the sample of curves can degrade the performance of most of them. In this work we propose a robust, model-based clustering method that relies on an approximation to the density function for functional data. The robustness follows from the joint application of data-driven trimming, which reduces the effect of contaminated observations, and constraints on the variances, which avoid spurious clusters in the solution. The algorithm is designed to perform clustering and outlier detection simultaneously by maximizing a trimmed pseudo-likelihood. The proposed method has been evaluated and compared with other existing methods through a simulation study. The proposed methodology performs better when a fraction of contaminating curves is added to a non-contaminated sample. Finally, an application to a real data set previously considered in the literature is given.
Pages: 201-225
Page count: 25
Related papers
26 in total
  • [1] Bouveyron C., 2014, FUNHDDC MODEL BASED
  • [2] Model-based clustering of time series in group-specific functional subspaces
    Bouveyron, Charles
    Jacques, Julien
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2011, 5 (04) : 281 - 300
  • [3] SCREE TEST FOR NUMBER OF FACTORS
    CATTELL, RB
    [J]. MULTIVARIATE BEHAVIORAL RESEARCH, 1966, 1 (02) : 245 - 276
  • [4] Cerioli A, 2017, J COMPUT GRAPH STAT
  • [5] Cuesta-Albertos JA, 1997, ANN STAT, V25, P553
  • [6] Impartial trimmed k-means for functional data
    Cuesta-Albertos, Juan Antonio
    Fraiman, Ricardo
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 51 (10) : 4864 - 4877
  • [7] DELAIGLE A, 2010, AOS, V38, P1171, DOI 10.1214/09-AOS741
  • [8] Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels
    Febrero, Manuel
    Galeano, Pedro
    Gonzalez-Manteiga, Wenceslao
    [J]. ENVIRONMETRICS, 2008, 19 (04) : 331 - 345
  • [9] Febrero-Bande M, 2012, J STAT SOFTW, V51, P1
  • [10] Ferraty F., 2006, SPR S STAT