Directed Clustering of Multivariate Data Based on Linear or Quadratic Latent Variable Models

Cited by: 1
Authors
Zhang, Yingjuan [1]
Einbeck, Jochen [1, 2]
Affiliations
[1] Univ Durham, Dept Math Sci, Durham DH1 3LE, England
[2] Univ Durham, Durham Res Methods Ctr, Durham DH1 3LE, England
Keywords
clustering; mixture model; latent variable model; dimension reduction; expectation-maximization algorithm; model selection; fundamental diagram; MAXIMUM-LIKELIHOOD; DISCRIMINANT-ANALYSIS
DOI
10.3390/a17080358
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider situations in which the clustering of some multivariate data is desired, which establishes an ordering of the clusters with respect to an underlying latent variable. As our motivating example for a situation where such a technique is desirable, we consider scatterplots of traffic flow and speed, where a pattern of consecutive clusters can be thought to be linked by a latent variable, which is interpretable as traffic density. We focus on latent structures of linear or quadratic shapes, and present an estimation methodology based on expectation-maximization, which estimates both the latent subspace and the clusters along it. The directed clustering approach is summarized in two algorithms and applied to the traffic example outlined. Connections to related methodology, including principal curves, are briefly drawn.
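The linear case described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes cluster centres of the form alpha + beta * z_k on a single line, a shared isotropic noise variance sigma2, at least two clusters with distinct starting mass points, and a conditional-maximization variant of EM that updates the line and the mass points in turn. The function and variable names (fit_linear_directed_clusters, alpha, beta, z, pi, sigma2) are hypothetical and chosen for illustration only.

```python
import numpy as np


def _sq_dist(X, alpha, beta, z):
    """Squared distance from every point to every centre alpha + beta * z_k."""
    mu = alpha + np.outer(z, beta)                              # (K, d)
    return ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (n, K)


def _responsibilities(X, alpha, beta, z, pi, sigma2):
    """E-step: posterior cluster probabilities, computed in log space."""
    logw = np.log(pi) - 0.5 * _sq_dist(X, alpha, beta, z) / sigma2
    logw -= logw.max(axis=1, keepdims=True)
    W = np.exp(logw)
    return W / W.sum(axis=1, keepdims=True)


def fit_linear_directed_clusters(X, K, n_iter=200):
    """Fit K clusters whose centres lie on one line (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Start from the first principal component (an assumed initialisation).
    alpha = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - alpha, full_matrices=False)
    beta = Vt[0]
    # Mass points start at evenly spaced quantiles of the projections.
    z = np.quantile((X - alpha) @ beta, (np.arange(K) + 0.5) / K)
    pi = np.full(K, 1.0 / K)
    sigma2 = X.var(axis=0).mean()

    for _ in range(n_iter):
        W = _responsibilities(X, alpha, beta, z, pi, sigma2)
        Nk = W.sum(axis=0) + 1e-12           # guard against empty clusters
        pi = Nk / Nk.sum()
        xbar = (W.T @ X) / Nk[:, None]       # weighted cluster means, (K, d)
        # Line update: weighted least squares of the cluster means on z.
        zbar = pi @ z
        xm = pi @ xbar
        zc = z - zbar
        beta = (pi * zc) @ (xbar - xm) / (pi @ zc ** 2)
        alpha = xm - beta * zbar
        # Mass-point update: project each cluster mean onto the line.
        z = (xbar - alpha) @ beta / (beta @ beta)
        # Shared isotropic variance update.
        sigma2 = max((W * _sq_dist(X, alpha, beta, z)).sum() / (n * d), 1e-12)

    # Relabel so that the cluster index increases with the latent variable.
    order = np.argsort(z)
    z, pi = z[order], pi[order]
    W = _responsibilities(X, alpha, beta, z, pi, sigma2)
    return {"alpha": alpha, "beta": beta, "z": z, "pi": pi,
            "sigma2": sigma2, "labels": W.argmax(axis=1)}
```

Calling fit_linear_directed_clusters(X, K=3) on an n-by-d array returns the fitted line, the ordered mass points and a hard cluster label per observation. The scale and sign of z are not identified in this sketch (rescaling z and beta together gives the same centres), so only the ordering of the clusters along the line should be interpreted, up to reversal.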
Pages: 21