Simultaneous dimension reduction and clustering via the NMF-EM algorithm

Cited by: 0
Authors
Léna Carel
Pierre Alquier
Affiliations
[1] Expedia Group
[2] RIKEN Center for Advanced Intelligence Project
Source
Advances in Data Analysis and Classification | 2021 / Vol. 15
Keywords
Mixture models; Ticketing data; Matrix factorization; Reduction of dimension; EM algorithm; Clustering; Hidden variables; Primary 62H30; Secondary 62H12; 62P25; 91C20
DOI
Not available
Abstract
Mixture models are among the most popular tools for clustering. However, when the dimension and the number of clusters are large, the estimation of the clusters becomes challenging, as does their interpretation. Restrictions on the parameters can be used to reduce the dimension. An example is given by mixtures of factor analyzers (MFA) for Gaussian mixtures. The extension of MFA to non-Gaussian mixtures is not straightforward. We propose a new constraint for the parameters of non-Gaussian mixture models: the parameters of the K components are combinations of elements from a small dictionary, say H elements, with $H \ll K$. Including a nonnegative matrix factorization (NMF) in the EM algorithm allows us to simultaneously estimate the dictionary and the parameters of the mixture. We propose the acronym NMF-EM for this algorithm, implemented in the R package nmfem. This original approach is motivated by passenger clustering from ticketing data: we apply NMF-EM to data from two Transdev public transport networks. In this case, the words of the dictionary are easily interpreted as typical slots in a timetable.
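The dictionary constraint described in the abstract can be illustrated with a minimal sketch. It assumes multinomial components over d categories (for example, time slots), which the abstract does not specify, and it only builds constrained parameters and counts free parameters; it does not implement the NMF-EM estimation procedure or the nmfem package's API. All names (`random_simplex`, `build_components`, `d`, `H`, `K`) are hypothetical.

```python
import random


def random_simplex(n, rng):
    """Draw a random probability vector of length n (nonnegative, sums to 1)."""
    draws = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(draws)
    return [x / total for x in draws]


def build_components(dictionary, weights):
    """Combine dictionary elements into component parameters.

    dictionary: H probability vectors of length d (the "words")
    weights:    K probability vectors of length H (one per component)
    returns:    K probability vectors of length d
    """
    d = len(dictionary[0])
    n_words = len(dictionary)
    return [
        [sum(w[h] * dictionary[h][j] for h in range(n_words)) for j in range(d)]
        for w in weights
    ]


rng = random.Random(0)
d, H, K = 24, 3, 10  # hypothetical sizes: 24 slots, 3 words, 10 clusters

dictionary = [random_simplex(d, rng) for _ in range(H)]
weights = [random_simplex(H, rng) for _ in range(K)]
components = build_components(dictionary, weights)

# Free parameters for the component distributions only (mixing proportions excluded).
free_unconstrained = K * (d - 1)            # each component estimated freely
free_constrained = H * (d - 1) + K * (H - 1)  # dictionary plus combination weights

print(free_unconstrained, free_constrained)
```

Because each component is a convex combination of probability vectors, it is itself a probability vector. With these illustrative sizes the constrained model has far fewer free parameters than the unconstrained one, which is the dimension reduction the abstract refers to.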
Pages: 231-260
Number of pages: 29