Mitigating the impact of measurement error when using penalized regression to model exposure in two-stage air pollution epidemiology studies

Cited by: 11

Authors:

Bergen, Silas ^{[1]}

Szpiro, Adam A. ^{[2]}

Affiliations:

[1] Winona State Univ, Winona, MN 55987 USA

[2] Univ Washington, Seattle, WA 98195 USA

Source:

ENVIRONMENTAL AND ECOLOGICAL STATISTICS | 2015 / Vol. 22 / Issue 03

Keywords:

Measurement error; Penalized regression; PM2.5; Systolic blood pressure; Two-stage modeling; LEAST-SQUARES; ESTIMATOR;

DOI:

10.1007/s10651-015-0314-y

Chinese Library Classification (CLC) number:

X [Environmental science, safety science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Air pollution epidemiology studies often implement a two-stage approach. Exposure models are built using observed monitoring data to predict exposure at participant locations where the true exposure is unobserved, and the predictions are then used to estimate the health effect. This induces measurement error which may bias the estimated health effect and affect its standard error. The impact of measurement error depends on assumed data generating mechanisms and the approach used to estimate and predict exposure. A paradigm wherein the exposure surface is fixed and the subject and monitoring locations are random has been previously motivated, but corresponding measurement error methods exist only when modeling exposure with simple, low-rank, unpenalized regression splines. We develop a comprehensive treatment of measurement error when modeling exposure with high-but-fixed-rank penalized regression splines. If sufficiently rich, these models well-approximate full-rank methods such as universal kriging while remaining asymptotically tractable. We describe the implications of penalization for measurement error, motivate choosing the penalty to optimize health effect inference, derive an asymptotic bias correction, and provide a simple non-parametric bootstrap to account for all sources of variability. We find that highly parameterizing the exposure model results in severely biased and inefficient health effect inference if no penalty is used. Choosing the penalty to mitigate measurement error yields much less bias and better efficiency, and can lead to better confidence interval coverage than other common penalty selection methods. Combining the bias correction with the non-parametric bootstrap yields accurate coverage of nominal 95% confidence intervals.

Pages: 601-631

Page count: 31
