Application of penalized linear regression methods to the selection of environmental enteropathy biomarkers

Cited by: 22
Authors
Lu, Miao [1]
Zhou, Jianhui [1]
Naylor, Caitlin [2]
Kirkpatrick, Beth D. [3, 4]
Haque, Rashidul [5]
Petri, William A., Jr. [2]
Ma, Jennie Z. [6]
Affiliations
[1] Univ Virginia, Dept Stat, Charlottesville, VA USA
[2] Univ Virginia, Sch Med, Div Infect Dis, Charlottesville, VA 22908 USA
[3] Univ Vermont, Dept Med, Coll Med, Burlington, VT 05405 USA
[4] Univ Vermont, Vaccine Testing Ctr, Coll Med, Burlington, VT 05405 USA
[5] Int Ctr Diarrhoeal Dis Res Bangladesh Icddr B, Dhaka, Bangladesh
[6] Univ Virginia, Dept Publ Hlth Sci, Div Biostat, Charlottesville, VA 22903 USA
Funding
Bill & Melinda Gates Foundation
Keywords
Biomarker selection; Penalized linear regression; Correlated covariates; Malnutrition; Environmental enteropathy; VARIABLE SELECTION; LASSO; SHRINKAGE; RISK;
DOI
10.1186/s40364-017-0089-4
Chinese Library Classification number
R73 [Oncology]
Subject classification code
100214
Abstract
Background: Environmental enteropathy (EE) is a subclinical condition caused by constant fecal-oral contamination, resulting in blunting of the intestinal villi and intestinal inflammation. The primary aim of the clinical research is to evaluate the association between non-invasive EE biomarkers and malnutrition in a cohort of Bangladeshi children. The challenges are that the number of biomarkers/covariates is relatively large and that some of them are highly correlated.

Methods: Many variable selection methods are available in the literature, but it remains unclear which are most appropriate for EE biomarker selection. In this study, different variable selection approaches were applied, and their performance was assessed numerically through simulation studies in which the correlations among covariates were assumed to be similar to those in the Bangladesh cohort. The methods suggested by the simulations were then applied to the Bangladesh cohort to select the biomarkers most relevant to the growth response, and bootstrapping was used to evaluate the consistency of the selection results.

Results: In the simulation studies, SCAD (Smoothly Clipped Absolute Deviation), adaptive LASSO (Least Absolute Shrinkage and Selection Operator) and MCP (Minimax Concave Penalty) emerged as the suggested variable selection methods over the traditional stepwise regression method. In the Bangladesh data, predictors such as mother's weight, height-for-age z-score (HAZ) at week 18, and inflammation markers (myeloperoxidase (MPO) at week 12 and soluble CD14 at week 18) were informative biomarkers associated with children's growth.

Conclusions: Penalized linear regression methods are plausible alternatives to traditional variable selection methods, and the suggested methods are applicable to other biomedical studies. The selected early-stage biomarkers offer a potential explanation for the burden of malnutrition in low-income countries, allow early identification of infants at risk, and suggest pathways for intervention.
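
The abstract names SCAD, adaptive LASSO and MCP without stating their penalty functions. As a rough illustration only, the sketch below writes out the standard textbook forms of these penalties for a single coefficient, so their difference from the plain LASSO penalty is visible: SCAD and MCP level off for large coefficients, while LASSO keeps growing. The function names, the defaults `a=3.7` and `gamma=3.0` (commonly used values), and the `beta_init` pilot estimate are assumptions made for this example and are not taken from the paper.

```python
def lasso_penalty(beta, lam):
    """Plain LASSO penalty: grows linearly with |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty in its standard three-piece form (requires a > 2).

    a=3.7 is a commonly used default, assumed here.
    """
    t = abs(beta)
    if t <= lam:
        # Small coefficients: identical to the LASSO penalty.
        return lam * t
    if t <= a * lam:
        # Middle range: quadratic piece that bends the penalty flat.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no extra shrinkage.
    return lam * lam * (a + 1) / 2


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty in its standard two-piece form (requires gamma > 1).

    gamma=3.0 is a commonly used default, assumed here.
    """
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2


def adaptive_lasso_penalty(beta, lam, beta_init, power=1.0):
    """Adaptive LASSO penalty: LASSO reweighted by a pilot estimate.

    beta_init is a hypothetical initial estimate (e.g. from least squares);
    coefficients with a large pilot estimate are penalized less.
    """
    weight = 1.0 / abs(beta_init) ** power
    return lam * weight * abs(beta)


if __name__ == "__main__":
    for b in (0.5, 2.0, 10.0):
        print(b, lasso_penalty(b, 1.0), scad_penalty(b, 1.0), mcp_penalty(b, 1.0))
```

In an actual analysis these penalties would be added to the residual sum of squares and minimized over all coefficients by a dedicated solver; the sketch only evaluates the penalty terms themselves.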
Pages: 10