Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting

Cited by: 921
Authors
Carrico, Caroline [1]
Gennings, Chris [1]
Wheeler, David C. [1]
Factor-Litvak, Pam [2]
Affiliations
[1] Virginia Commonwealth Univ, Sch Med, Dept Biostat, Richmond, VA 23284 USA
[2] Columbia Univ, Mailman Sch Publ Hlth, Dept Epidemiol, New York, NY USA
Keywords
Correlation; Nonlinear model; WQS; Subset selection; Variable selection; URINARY PHTHALATE METABOLITES; AIR-POLLUTION; OXIDATIVE STRESS; EXPOSURE; EXPOSOME; HEALTH; REGULARIZATION; SELECTION
DOI
10.1007/s13253-014-0180-3
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
In risk evaluation, the effect of mixtures of environmental chemicals on a common adverse outcome is of interest. However, because of the high dimensionality and the inherent correlations among chemicals that occur together, traditional methods (e.g., ordinary or logistic regression) suffer from collinearity and variance inflation, and shrinkage methods have limitations in selecting among correlated components. We propose a weighted quantile sum (WQS) approach to estimating a body burden index, which identifies "bad actors" in a set of highly correlated environmental chemicals. In extensive simulation studies, we evaluate and characterize the accuracy of WQS regression in variable selection in terms of sensitivity and specificity (i.e., the ability of the WQS method to select the bad actors correctly without selecting incorrect ones). We demonstrate the improvement in accuracy this method provides over traditional ordinary regression and shrinkage methods (lasso, adaptive lasso, and elastic net). The simulation results show that WQS regression is accurate under some environmentally relevant conditions, but that, for a fixed correlation pattern, its accuracy decreases as the association with the response variable diminishes. Nonzero weights (i.e., weights exceeding a selection threshold parameter) may be used to identify bad actors; however, components within a cluster of highly correlated active components tend to have lower weights, with the sum of their weights representative of the set. Supplementary materials accompanying this paper appear online.
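The abstract gives no formulas, so the following is only a minimal sketch of the ideas it names, not the authors' implementation. It assumes components are scored into quartiles, that the weights are nonnegative and sum to one, and that a component is selected when its weight exceeds a threshold. The function names, the quartile choice, and the threshold value of 0.1 are illustrative assumptions; estimating the weights themselves (the regression step) is not shown.

```python
import numpy as np

def quantile_scores(X, n_quantiles=4):
    """Score each column of X into 0..n_quantiles-1 by its empirical quantiles.

    Quartiles (n_quantiles=4) are an assumption made for illustration.
    """
    X = np.asarray(X, dtype=float)
    scores = np.empty(X.shape, dtype=int)
    probs = np.linspace(0, 1, n_quantiles + 1)[1:-1]  # interior cut probabilities
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], probs)
        scores[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return scores

def wqs_index(scores, weights):
    """Weighted sum of quantile scores; weights must be >= 0 and sum to 1."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return np.asarray(scores) @ weights

def select_bad_actors(weights, threshold):
    """Flag components whose weight exceeds the (hypothetical) threshold."""
    return np.asarray(weights, dtype=float) > threshold

def sensitivity_specificity(selected, truth):
    """Sensitivity and specificity of a selection against known active components."""
    selected = np.asarray(selected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    sensitivity = np.sum(selected & truth) / truth.sum()
    specificity = np.sum(~selected & ~truth) / (~truth).sum()
    return sensitivity, specificity

# Toy usage with made-up weights (not results from the paper).
weights = [0.5, 0.3, 0.15, 0.05]
truth = [True, True, False, False]   # which components are truly active
selected = select_bad_actors(weights, threshold=0.1)
sens, spec = sensitivity_specificity(selected, truth)
```

In this toy case the third component is selected although it is inactive, so sensitivity is 1.0 and specificity is 0.5. In a simulation study of the kind the abstract describes, such pairs would be computed over many simulated data sets.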
Pages: 100-120
Page count: 21