Seemingly unrelated clusterwise linear regression for contaminated data

Cited by: 3
Authors
Perrone, Gabriele [1]
Soffritti, Gabriele [1]
Affiliations
[1] Alma Mater Studiorum Univ Bologna, Dept Stat Sci, Via Belle Arti 41, I-40126 Bologna, Italy
Keywords
Contaminated Gaussian distribution; ECM algorithm; Mild outlier; Mixture of regression models; Model-based cluster analysis; Seemingly unrelated regression; MAXIMUM-LIKELIHOOD-ESTIMATION; MIXTURE REGRESSION; EM ALGORITHM; MODELS; VALUES;
DOI
10.1007/s00362-022-01344-6
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract
Clusterwise regression is an approach to regression analysis based on finite mixtures which is generally employed when sample observations come from a population composed of several unknown sub-populations. Whenever the response is continuous, Gaussian clusterwise linear regression models are usually employed. Such models have recently been robustified with respect to the possible presence of mild outliers in the sub-populations. However, in some fields of research, especially in the modelling of multivariate economic data or data from the social sciences, there may be prior information on the specific covariates to be considered in the linear term employed in the prediction of a certain response. As a consequence, covariates may not be the same for all responses. Thus, a novel class of multivariate Gaussian linear clusterwise regression models is proposed. This class provides an extension to mixture-based regression analysis for modelling multivariate and correlated responses in the presence of mild outliers that leaves the researcher free to use a different vector of covariates for each response. Details about the model identification and maximum likelihood estimation via an expectation-conditional maximisation algorithm are given. The performance of the new models is studied by simulation in comparison with other clusterwise linear regression models. A comparative evaluation of their effectiveness and usefulness is provided through the analysis of a real dataset.
Pages: 883-921
Number of pages: 39