Unsupervised learning of mixture regression models for longitudinal data

Cited by: 5
Authors
Xu, Peirong [1]
Peng, Heng [2]
Huang, Tao [3]
Affiliations
[1] Shanghai Normal Univ, Coll Math & Sci, Shanghai, Peoples R China
[2] Hong Kong Baptist Univ, Dept Math, Hong Kong, Hong Kong, Peoples R China
[3] Shanghai Univ Finance & Econ, Sch Stat & Management, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised learning; Model selection; Longitudinal data analysis; Quasi-likelihood; EM algorithm; PRIMARY BILIARY-CIRRHOSIS; LINEAR-MIXED MODELS; EFFICIENT ESTIMATION; DENSITY-ESTIMATION; LIKELIHOOD; MULTIVARIATE; SELECTION; PENALTY; PATIENT;
DOI
10.1016/j.csda.2018.03.012
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
This paper is concerned with learning mixture regression models for individuals who are measured repeatedly. The adjective "unsupervised" implies that the number of mixing components is unknown and has to be determined, ideally by data-driven tools. For this purpose, a novel penalized method is proposed that simultaneously selects the number of mixing components and estimates the mixture proportions and unknown parameters in the models. The proposed method can handle both continuous and discrete responses because it requires only the first two moment conditions of the model distribution. It is shown to be consistent both in selecting the number of components and in estimating the mixture proportions and unknown regression parameters. Further, a modified EM algorithm is developed to seamlessly integrate model selection and estimation. Simulation studies are conducted to evaluate the finite-sample performance of the proposed procedure. The procedure is further illustrated through an analysis of a primary biliary cirrhosis data set. © 2018 Elsevier B.V. All rights reserved.
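To make the idea of integrating component selection into EM concrete, here is a minimal sketch under stated assumptions. It is not the paper's penalized quasi-likelihood method: it assumes Gaussian responses, working independence within each subject, and a generic truncation rule that drops any component whose effective number of subjects falls to `lam` or below. The function name `fit_pruned_mixture`, the parameter `lam`, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_pruned_mixture(X, y, subject, K=5, lam=2.0, n_iter=100, seed=0):
    """EM for a mixture of linear regressions with subject-level memberships.

    X: (N, p) design matrix, y: (N,) responses, subject: (N,) ids in 0..n-1.
    Assumes n > K * lam so that at least one component always survives.
    """
    rng = np.random.default_rng(seed)
    n, p = subject.max() + 1, X.shape[1]
    R = rng.dirichlet(np.ones(K), size=n)        # random soft memberships, (n, K)
    for _ in range(n_iter):
        # Selection step (illustrative rule, not the paper's penalty):
        # drop components supported by no more than `lam` effective subjects.
        nk = R.sum(axis=0)
        keep = nk > lam
        R, nk = R[:, keep], nk[keep]
        pi = (nk - lam) / (nk - lam).sum()       # shrunken mixing proportions
        # M-step: weighted least squares for each surviving component.
        W = R[subject]                           # observation-level weights, (N, K')
        betas = np.empty((R.shape[1], p))
        sig2 = np.empty(R.shape[1])
        for k in range(R.shape[1]):
            w = W[:, k]
            XtW = X.T * w
            betas[k] = np.linalg.solve(XtW @ X + 1e-8 * np.eye(p), XtW @ y)
            resid = y - X @ betas[k]
            sig2[k] = max((w * resid**2).sum() / w.sum(), 1e-6)
        # E-step: sum Gaussian log-densities over each subject's observations.
        resid = y[:, None] - X @ betas.T
        obs_ll = -0.5 * (np.log(2 * np.pi * sig2) + resid**2 / sig2)
        ll = np.zeros((n, R.shape[1]))
        np.add.at(ll, subject, obs_ll)
        logR = np.log(pi) + ll
        logR -= logR.max(axis=1, keepdims=True)  # stabilize before exponentiating
        R = np.exp(logR)
        R /= R.sum(axis=1, keepdims=True)
    return pi, betas, sig2, R

# Simulated longitudinal data: 60 subjects, 5 visits each, two latent groups.
rng = np.random.default_rng(1)
n_subj, n_obs = 60, 5
subject = np.repeat(np.arange(n_subj), n_obs)
t = rng.uniform(0, 1, size=n_subj * n_obs)
X = np.column_stack([np.ones_like(t), t])
group = (np.arange(n_subj) >= 30).astype(int)    # hypothetical true labels
true_beta = np.array([[0.0, 3.0], [4.0, -3.0]])
y = (X * true_beta[group[subject]]).sum(axis=1) + rng.normal(0, 0.3, size=t.size)

pi, betas, sig2, R = fit_pruned_mixture(X, y, subject)
print("components kept:", len(pi))
print("mixing proportions:", pi.round(3))
```

The fit deliberately starts with more components than expected (`K=5`) and lets the truncation step remove weakly supported ones during the iterations, so selection and estimation happen in a single run. How many components survive depends on `lam` and on the random initialization, so the result should be checked across several seeds.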
Pages: 44-56
Page count: 13