Predictors with measurement error in mixtures of polynomial regressions

Times cited: 2
Authors
Fang, Xiaoqiong [1]
Chen, Andy W. [2]
Young, Derek S. [3]
Affiliations
[1] Corp & Investment Bank JP Morgan, Brooklyn, NY USA
[2] Seattle Pacific Univ, Sch Business Govt & Econ, Seattle, WA USA
[3] Univ Kentucky, Dr Bing Zhang Dept Stat, Lexington, KY 40546 USA
Keywords
Bootstrap; Finite mixture models; GEM algorithm; Model selection; Regression calibration; Surrogate data; COVARIATE MEASUREMENT ERROR; MAXIMUM-LIKELIHOOD; LOGISTIC-REGRESSION; MODELS; IDENTIFIABILITY; INFERENCE; SELECTION;
DOI
10.1007/s00180-022-01232-5
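Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes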
020208 ; 070103 ; 0714 ;
Abstract
There has been a substantial body of research on mixtures-of-regressions models that has developed over the past 20 years. While much of the recent literature has focused on flexible mixtures-of-regressions models, there is still considerable utility for imposing structure on the mixture components through fully parametric models. One feature of the data that is scantly addressed in mixtures of regressions is the presence of measurement error in the predictors. The limited existing research on this topic concerns the case where classical measurement error is added to the classic mixtures-of-linear-regressions model. In this paper, we consider the setting of mixtures of polynomial regressions where the predictors are subject to classical measurement error. Moreover, each component is allowed to have a different degree for the polynomial structure. We utilize a generalized expectation-maximization algorithm for performing maximum likelihood estimation. For estimating standard errors, we extend a semiparametric bootstrap routine that has been employed for mixtures of linear regressions without measurement error in the predictors. Numeric work, for practical reasons identified, is limited to estimating two-component models. We consider a likelihood ratio test for determining if there is a higher-degree polynomial term in one of the components. Model selection criteria are also highlighted as a way for determining an appropriate model. A simulation study and an application to the classic nitric oxide emissions data are provided.
Pages: 373-401
Number of pages: 29