Component selection and variable selection for mixture regression models

Cited by: 1
Authors
Qi, Xuefei [1 ]
Xu, Xingbai [2 ]
Feng, Zhenghui [3 ]
Peng, Heng [4 ]
Affiliations
[1] Xiamen Univ, Paula & Gregory Chow Inst Studies Econ, Xiamen 361005, Peoples R China
[2] Xiamen Univ, Wang Yanan Inst Studies Econ WISE, Sch Econ, Dept Stat & Data Sci, Xiamen 361005, Peoples R China
[3] Harbin Inst Technol, Sch Sci, Shenzhen 518055, Peoples R China
[4] Hong Kong Baptist Univ, Dept Math, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Finite mixture regression models; Non-Gaussian; Component selection; Variable selection; NONCONCAVE PENALIZED LIKELIHOOD; FINITE MIXTURE; IDENTIFIABILITY;
DOI
10.1016/j.csda.2024.108124
CLC number
TP39 [Computer Applications];
Discipline code
081203; 0835;
Abstract
Finite mixture regression models are commonly used to account for heterogeneity in populations and in situations where the assumptions of standard regression models may not hold. To expand the range of component distributions beyond the Gaussian, other distributions, such as the exponential power distribution and the skew-normal distribution, are explored. To enable simultaneous model estimation, order selection, and variable selection, this paper proposes a penalized likelihood estimation approach that imposes penalties on both the mixing proportions and the regression coefficients, which we call the double-penalized likelihood method. Four double-penalized likelihood functions and their performance are studied. The consistency of the estimators, of order selection, and of variable selection is investigated. A modified expectation-maximization algorithm is proposed to implement the double-penalized likelihood method. Numerical simulations demonstrate the effectiveness of the proposed method and algorithm. Finally, real data analyses illustrate the application of the approach. Overall, this study contributes to the development of mixture regression models and provides a useful tool for model and variable selection.
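The double-penalized idea in the abstract — an EM iteration that shrinks both mixing proportions (for order selection) and regression coefficients (for variable selection) — can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes Gaussian components, uses lasso-style soft-thresholding as a simple stand-in for the nonconcave penalties studied in the paper, and penalizes the mixing proportions by shrinking small weights toward zero before renormalizing. All function names and penalty choices here are our own assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    # Lasso proximal step; a simple stand-in for nonconcave penalties such as SCAD.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def em_step(X, y, pi, beta, sigma, lam_beta, lam_pi):
    """One illustrative EM iteration for a K-component Gaussian mixture of
    linear regressions with penalties on both mixing proportions and
    coefficients (hedged sketch; not the paper's exact algorithm)."""
    n, p = X.shape
    K = len(pi)
    # E-step: posterior responsibilities of each component for each point.
    dens = np.empty((n, K))
    for k in range(K):
        r = y - X @ beta[k]
        dens[:, k] = pi[k] * np.exp(-0.5 * (r / sigma[k]) ** 2) / (np.sqrt(2 * np.pi) * sigma[k])
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step, proportions: shrink small weights toward zero, then renormalize,
    # so negligible components can be dropped (order selection).
    new_pi = np.maximum(resp.sum(axis=0) / n - lam_pi, 0.0)
    new_pi /= new_pi.sum()
    # M-step, coefficients: weighted least squares followed by coordinate-wise
    # soft-thresholding (a crude approximation to a penalized M-step).
    new_beta = np.empty_like(beta)
    new_sigma = np.empty_like(sigma)
    for k in range(K):
        w = resp[:, k]
        b = np.linalg.solve(X.T @ (w[:, None] * X) + 1e-8 * np.eye(p), X.T @ (w * y))
        new_beta[k] = soft_threshold(b, lam_beta)
        r = y - X @ new_beta[k]
        new_sigma[k] = np.sqrt((w * r ** 2).sum() / w.sum())
    return new_pi, new_beta, new_sigma
```

Iterating `em_step` from rough starting values monotonically refines the penalized fit; zeroed coefficients indicate irrelevant covariates, and proportions driven to zero indicate redundant components.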
Pages: 18