Distance-based Clustering of Functional Data with Derivative Principal Component Analysis

Times cited: 0
Authors
Yu, Ping [1]
Shi, Gongming [2]
Wang, Chunjie [3]
Song, Xinyuan [4]
Affiliations
[1] Shanxi Normal Univ, Sch Math & Comp Sci, Taiyuan, Peoples R China
[2] Capital Univ Econ & Business, Sch Stat, Beijing, Peoples R China
[3] Changchun Univ Technol, Sch Math & Stat, Changchun, Peoples R China
[4] Chinese Univ Hong Kong, Dept Stat, Shatin, Hong Kong, Peoples R China
Keywords
Clustering; Curve derivatives; Functional principal component analysis; Identifiability; Projection;
DOI
10.1080/10618600.2024.2366499
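Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes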
020208 ; 070103 ; 0714 ;
Abstract
Functional data analysis (FDA) is an important modern paradigm for handling infinite-dimensional data. An important task in FDA is clustering, which identifies subgroups based on the shapes of measured curves. Considering that derivatives can provide additional useful information about the shapes of functionals, we propose a novel L2 distance between two random functions by incorporating the functions and their derivative information to determine the dissimilarity of curves under a unified scheme for dense observations. The Karhunen-Loève expansion is used to approximate the curves and their derivatives. Cluster membership prediction for each curve intends to minimize the new distances between the observed and predicted curves through subspace projection among all possible clusters. We provide consistent estimators for the curves, curve derivatives, and the proposed distance. Identifiability issues of the clustering procedure are also discussed. The utility of the proposed method is illustrated via simulation studies and applications to two real datasets. The proposed method can considerably improve cluster performance compared with existing functional clustering methods. Supplementary materials for the article are available online.
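The idea of measuring dissimilarity with both a curve and its derivative can be illustrated with a minimal sketch. This is not the estimator from the article: the function names, the `weight` parameter, the finite-difference derivatives, and the simple additive form of the distance are all assumptions made for illustration, and the article's definition (built on Karhunen-Loève expansions and subspace projection) may differ.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal-rule integral of sampled values over grid."""
    return float(np.sum((values[1:] + values[:-1]) / 2.0 * np.diff(grid)))


def derivative_augmented_distance(x, y, grid, weight=1.0):
    """L2-type distance between two densely sampled curves that uses both
    the curves and their first derivatives.

    ASSUMPTION: the `weight` parameter and the finite-difference
    derivatives are illustrative choices, not the article's definition.
    """
    dx = np.gradient(x, grid)  # finite-difference estimate of x'(t)
    dy = np.gradient(y, grid)  # finite-difference estimate of y'(t)
    level_term = trapezoid((x - y) ** 2, grid)    # integral of (x - y)^2
    slope_term = trapezoid((dx - dy) ** 2, grid)  # integral of (x' - y')^2
    return float(np.sqrt(level_term + weight * slope_term))


# Hypothetical curves on a dense grid over [0, 1].
grid = np.linspace(0.0, 1.0, 201)
base = np.sin(2 * np.pi * grid)
shifted = base + 0.5                              # same shape, different level
wiggly = base + 0.05 * np.sin(40 * np.pi * grid)  # similar level, different shape

d_shift = derivative_augmented_distance(base, shifted, grid)
d_wiggle_plain = derivative_augmented_distance(base, wiggly, grid, weight=0.0)
d_wiggle_full = derivative_augmented_distance(base, wiggly, grid)
```

In this toy setup, a vertical shift leaves the derivative term at zero, whereas a small high-frequency perturbation is nearly invisible to the plain L2 distance (`weight=0.0`) but becomes large once the derivative term is included.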
Pages: 47-58
Page count: 12