Clusterwise elastic-net regression based on a combined information criterion

Cited by: 0
Authors
Xavier Bry
Ndèye Niang
Thomas Verron
Stéphanie Bougeard
Affiliations
[1] University of Montpellier
[2] IMAG
[3] CEDRIC CNAM
[4] DANAIS
[5] Anses (French Agency for Food, Environmental and Occupational Health Safety)
Source
Advances in Data Analysis and Classification | 2023 / Vol. 17
Keywords
Clusterwise regression; Typological regression; Regularization; Multicollinearity; Ridge regression; Elastic-net regularization; 62H30; 62H25; 91C20
DOI
Not available
Abstract
Many research questions involve a regression problem in which the population under study is not homogeneous with respect to the underlying model. In this setting, we propose an original method called Combined Information criterion CLUSterwise elastic-net regression (Ciclus). This method handles several methodological and application-related challenges. It is derived from both information theory and microeconomic utility theory, and it maximizes a well-defined criterion combining three weighted sub-criteria, each related to a specific aim: obtaining a parsimonious partition, obtaining compact clusters for better prediction of cluster membership, and achieving a good within-cluster regression fit. The solving algorithm converges monotonically under mild assumptions. The Ciclus principle provides an innovative solution to two key issues: (i) automatic optimization of the number of clusters and (ii) the proposal of a prediction model. We applied it to elastic-net regression in order to handle high-dimensional data involving redundant explanatory variables. Ciclus is illustrated through both a simulation study and a real example on omics data, showing how it improves prediction quality and facilitates interpretation. It should therefore prove useful whenever the data involve a population mixture, as, for example, in biology, social sciences, economics, or marketing.
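To make the clusterwise elastic-net setting in the abstract more concrete, here is a minimal sketch of a plain alternating scheme: fit one elastic-net model per cluster, then reassign each observation to the cluster whose model fits it best. This is not the Ciclus algorithm. It does not implement the combined information criterion, the automatic choice of the number of clusters, or the cluster-membership prediction described above. The function names, the coordinate-descent solver, the penalty parametrization (`lam`, `alpha`), and the toy data are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.01, alpha=0.5, n_sweeps=100):
    """Coordinate descent (no intercept) for the assumed objective
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    X is a list of rows, y a list of floats."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, with beta = 0 at the start
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm_sq = sum(c * c for c in col) / n
            if norm_sq == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam * alpha) / (norm_sq + lam * (1 - alpha))
            delta = new - beta[j]
            resid = [r - c * delta for r, c in zip(resid, col)]
            beta[j] = new
    return beta


def predict(row, beta):
    return sum(v * b for v, b in zip(row, beta))


def clusterwise_elastic_net(X, y, labels, n_clusters, n_rounds=20, **enet_args):
    """Alternate per-cluster elastic-net fits with reassignment of each
    observation to the cluster giving the smallest squared residual."""
    labels = list(labels)
    betas = [[0.0] * len(X[0]) for _ in range(n_clusters)]
    for _ in range(n_rounds):
        for k in range(n_clusters):
            idx = [i for i, lab in enumerate(labels) if lab == k]
            if idx:  # an empty cluster keeps its previous coefficients
                betas[k] = fit_elastic_net(
                    [X[i] for i in idx], [y[i] for i in idx], **enet_args
                )
        new_labels = [
            min(range(n_clusters),
                key=lambda k: (y[i] - predict(X[i], betas[k])) ** 2)
            for i in range(len(X))
        ]
        if new_labels == labels:  # partition is stable, stop
            break
        labels = new_labels
    return labels, betas


# Toy data (hypothetical): two hidden regimes, y = 2x for even i, y = -2x for odd i.
X = [[(i + 1) / 10] for i in range(40)]
y = [(2.0 if i % 2 == 0 else -2.0) * X[i][0] for i in range(40)]
init = [0 if i < 20 else 1 for i in range(40)]  # start that ignores the regimes
labels, betas = clusterwise_elastic_net(X, y, init, n_clusters=2)
```

On this noiseless toy example the scheme separates the two regimes and returns slopes close to 2 and -2, slightly shrunk by the penalty. A scheme of this kind needs the number of clusters as an input and gives no rule for assigning new observations to a cluster, which are the two issues the abstract says Ciclus addresses.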
Pages: 75-107
Page count: 32