Estimation and group-feature selection in sparse mixture-of-experts with diverging number of parameters

Cited by: 0

Authors:

Khalili, Abbas [1]

Yang, Archer Yi [1,2]

Da, Xiaonan [3]

Affiliations:

[1] McGill Univ, Dept Math & Stat, Montreal, PQ, Canada

[2] Mila Quebec AI Inst, Montreal, PQ, Canada

[3] Stat Canada, Ottawa, ON, Canada

Source:

JOURNAL OF STATISTICAL PLANNING AND INFERENCE | 2025 / Vol. 237

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Regularization; Variable selection; Mixture-of-experts; NONCONCAVE PENALIZED LIKELIHOOD; MAXIMUM-LIKELIHOOD; VARIABLE SELECTION; FINITE MIXTURE; REGRESSION-MODELS; EM ALGORITHM; IDENTIFIABILITY; REGULARIZATION;

DOI:

10.1016/j.jspi.2024.106250

Chinese Library Classification:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Subject classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Mixture-of-experts provide flexible statistical models for a wide range of regression (supervised learning) problems. In many modern applications, a large number of covariates (features) are available, yet only a small subset of them is useful in explaining a response variable of interest. This calls for a feature selection device. In this paper, we present new group-feature selection and estimation methods for sparse mixture-of-experts models when the number of features can be nearly comparable to the sample size. We prove the consistency of the methods in both parameter estimation and feature selection. We implement the methods using a modified EM algorithm combined with a proximal gradient method, which results in a convenient closed-form parameter update in the M-step of the algorithm. We examine the finite-sample performance of the methods through simulations, and demonstrate their application in a real data example exploring relationships in body measurements.
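
As a rough illustration of why a proximal gradient step can yield a closed-form update, the sketch below applies group soft-thresholding, the proximal operator of a group-lasso (L2-norm) penalty. The group-lasso penalty, the function names, and the step size and tuning parameter are assumptions made for this example; the abstract does not state which penalty the paper uses, and this is not the authors' implementation.

```python
import math


def group_soft_threshold(v, threshold):
    """Proximal operator of threshold * ||v||_2 for one coefficient group.

    Assumption: a group-lasso penalty, chosen only for illustration.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= threshold:
        # The whole group is set to zero, which deselects the feature.
        return [0.0] * len(v)
    scale = 1.0 - threshold / norm
    return [scale * x for x in v]


def proximal_gradient_step(beta_groups, grad_groups, step, lam):
    """One proximal gradient update applied group by group.

    beta_groups: current coefficients, one list per feature group.
    grad_groups: gradient of the smooth part of the objective, same shape.
    step, lam:   hypothetical step size and penalty tuning parameter.
    """
    updated = []
    for beta, grad in zip(beta_groups, grad_groups):
        # Gradient step on the smooth part, then shrink the group.
        shifted = [b - step * g for b, g in zip(beta, grad)]
        updated.append(group_soft_threshold(shifted, step * lam))
    return updated
```

In this sketch each group is either shrunk toward zero or removed entirely, so estimation and group-level selection happen in the same explicit update, with no inner optimization loop.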

Pages: 17
