Repeated holdout validation for weighted quantile sum regression

Times cited: 150
Authors
Tanner, Eva M. [1]
Bornehag, Carl-Gustaf [1, 2]
Gennings, Chris [1]
Affiliations
[1] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[2] Karlstad Univ, Karlstad, Sweden
Keywords
Environmental epidemiology; Chemical mixtures; Cross-validation; Bootstrap; Uncertainty plot; Chemical of concern
DOI
10.1016/j.mex.2019.11.008
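Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract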
Weighted Quantile Sum (WQS) regression is a method commonly used in environmental epidemiology to assess the association between chemical mixtures and a health outcome of interest. Data are partitioned into a single training set and test set to make the chemical weights less sample-specific. However, at typical epidemiologic sample sizes, this may produce unstable chemical weights and WQS index estimates, and investigators may resort to training and testing on the same data. To solve this problem, we propose repeated holdout validation, whereby data are randomly partitioned 100 times, producing a distribution of validated results. The mean is taken as the final estimate, and confidence intervals may also be calculated for inference. Further, this method helps characterize the variability in chemical weights, aiding in the identification of chemicals of concern. This is important because it may direct future research toward specific chemicals. Using data from 718 mother-child pairs in the Swedish Environmental Longitudinal, Mother and Child, Asthma and Allergy (SELMA) study, we assessed the association between prenatal exposure to 26 endocrine disrupting chemicals and child Intelligence Quotient (IQ). Results using a single partition were unstable, varying by random seed. The WQS index estimate was significant when all data were used (i.e., no partition) (beta = -2.2; CI = -3.43, -0.98), but attenuated and nonsignificant using repeated holdout validation (beta = -0.82; CI = -2.11, 0.45). When implementing WQS in epidemiologic studies with limited sample sizes, repeated holdout validation is a viable alternative to using a single partition or no partitioning. Repeated holdout can both stabilize results and help characterize the uncertainty in identifying chemicals of concern, while maintaining some of the rigor of holdout validation.
Repeated holdout validation improves the stability of WQS estimates in finite study samples.
Uncertainty in identifying toxic chemicals of concern is acknowledged and characterized.
© 2019 The Author(s). Published by Elsevier B.V.
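The repeated holdout procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weight-fitting step (`fit_weights`) is a deliberately simplified, hypothetical stand-in for WQS estimation, the 40/60 training/validation split and the 2.5th/97.5th percentile interval are assumptions not stated in this record, and the data are synthetic quartile scores generated only for the demonstration.

```python
import random
import statistics


def slope(x, y):
    """Ordinary least-squares slope of y regressed on a single predictor x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def fit_weights(X, y):
    """Toy stand-in for WQS weight estimation (NOT the real procedure).

    Each weight is proportional to the chemical's negative univariate slope,
    floored at zero, so the weights are non-negative and sum to one.
    """
    raw = [max(0.0, -slope([row[j] for row in X], y)) for j in range(len(X[0]))]
    total = sum(raw)
    if total == 0.0:  # no chemical points in the negative direction
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def repeated_holdout(X, y, n_repeats=100, train_frac=0.4, seed=0):
    """Repeat a random train/validation split and collect validated estimates."""
    rng = random.Random(seed)
    n = len(y)
    n_train = int(n * train_frac)  # assumed split fraction
    betas, weight_runs = [], []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, valid = idx[:n_train], idx[n_train:]
        # Weights come from the training part only.
        w = fit_weights([X[i] for i in train], [y[i] for i in train])
        # The weighted index and its slope are evaluated on the held-out part.
        index = [sum(wj * xj for wj, xj in zip(w, X[i])) for i in valid]
        betas.append(slope(index, [y[i] for i in valid]))
        weight_runs.append(w)
    cuts = statistics.quantiles(betas, n=40)  # cut points every 2.5%
    return {
        "betas": betas,
        "beta_mean": statistics.fmean(betas),
        "interval": (cuts[0], cuts[-1]),  # assumed percentile interval
        "mean_weights": [statistics.fmean(col) for col in zip(*weight_runs)],
    }


# Synthetic demo: 400 subjects, 5 chemicals scored as quartiles 0..3.
data_rng = random.Random(1)
X = [[data_rng.randrange(4) for _ in range(5)] for _ in range(400)]
y = [100 - 2 * r[0] - 1 * r[1] + data_rng.gauss(0, 3) for r in X]
result = repeated_holdout(X, y)
print(result["beta_mean"], result["interval"], result["mean_weights"])
```

Keeping every repetition's weights, rather than only their mean, is what allows the spread of each chemical's weight across partitions to be inspected when judging which chemicals are of concern.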
Pages: 2855-2860
Number of pages: 6