Variable selection models based on multiple imputation with an application for predicting median effective dose and maximum effect

Cited by: 12
Authors
Wan, Y. [1]
Datta, S. [1]
Conklin, D. J. [2]
Kong, M. [1]
Affiliations
[1] Univ Louisville, Dept Bioinformat & Biostat, Louisville, KY 40292 USA
[2] Univ Louisville, Dept Med, Div Cardiovasc Med, Louisville, KY 40292 USA
Keywords
penalized least squares; multiple imputation; elastic net; variable selection; endothelial dysfunction; vascular inflammation; insulin resistance; regression; regularization
DOI
10.1080/00949655.2014.907801
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Variable selection and prediction can be challenging when covariates have missing values. Although multiple imputation (MI) is a widely accepted technique for handling missing data, it is not clear how to combine MI results for variable selection, because different imputed data sets may lead to different selected variables. Widely applied variable selection methods include the sparse partial least-squares (SPLS) method and penalized least-squares methods such as the elastic net (ENet). In this paper, we propose an MI-based weighted elastic net (MI-WENet) method that is based on stacked MI data and a weighting scheme for each observation in the stacked data set. In the MI-WENet method, MI accounts for sampling and imputation uncertainty for missing values, and the weight accounts for the observed information. Extensive numerical simulations are carried out to compare the proposed MI-WENet method with competing alternatives such as SPLS and ENet. In addition, we apply the MI-WENet method to examine the predictor variables for endothelial function, which can be characterized by the median effective dose (ED50) and maximum effect (Emax) in an ex-vivo phenylephrine-induced extension and acetylcholine-induced relaxation experiment.
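The stacking-and-weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The donor-pool imputation, the weight f_i / M (fraction of observed covariates in row i divided by the number of imputations M), the scaling of the penalized objective, and all function names are assumptions made for illustration only.

```python
import random

def stack_imputations(X, y, n_imputations=5, seed=0):
    """Stack M imputed copies of X, where None marks a missing covariate.

    Assumption: missing entries are filled by drawing from the observed
    values of the same column (a stand-in for a proper MI procedure).
    Assumption: row i gets weight f_i / M, with f_i the fraction of its
    covariates that were observed. Each column needs at least one
    observed value.
    """
    rng = random.Random(seed)
    p = len(X[0])
    pools = [[row[j] for row in X if row[j] is not None] for j in range(p)]
    Xs, ys, ws = [], [], []
    for _ in range(n_imputations):
        for row, target in zip(X, y):
            frac_observed = sum(v is not None for v in row) / p
            Xs.append([v if v is not None else rng.choice(pools[j])
                       for j, v in enumerate(row)])
            ys.append(target)
            ws.append(frac_observed / n_imputations)
    return Xs, ys, ws

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_enet(X, y, w, lam=0.1, alpha=0.5, n_iter=200):
    """Weighted elastic net fitted by cyclic coordinate descent.

    Minimizes 0.5 * sum_i w_i (y_i - b0 - x_i . beta)^2
              + lam * (alpha * |beta|_1 + 0.5 * (1 - alpha) * |beta|_2^2).
    """
    n, p = len(X), len(X[0])
    beta, b0, wsum = [0.0] * p, 0.0, sum(w)
    resid = list(y)  # residuals for b0 = 0 and beta = 0
    for _ in range(n_iter):
        # Unpenalized intercept: weighted mean of the current residuals.
        delta = sum(w[i] * resid[i] for i in range(n)) / wsum
        b0 += delta
        resid = [r - delta for r in resid]
        for j in range(p):
            # Weighted inner product of column j with its partial residual.
            rho = sum(w[i] * X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n))
            denom = sum(w[i] * X[i][j] ** 2 for i in range(n)) + lam * (1 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom
            diff = new - beta[j]
            if diff != 0.0:
                resid = [resid[i] - X[i][j] * diff for i in range(n)]
                beta[j] = new
    return b0, beta
```

Calling `stack_imputations` and passing its three outputs to `weighted_enet` yields one coefficient vector for the whole stacked data set, so a single set of selected variables (the nonzero coefficients) is obtained instead of one set per imputation.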
Pages: 1902-1916
Page count: 15
Related papers
32 in total
  • [1] [Anonymous], COMPUT STAT DATA ANA
  • [2] [Anonymous], 1987, MULTIPLE IMPUTATION
  • [3] [Anonymous], STAT MED
  • [4] Sensitivity analysis after multiple imputation under missing at random: a weighting approach
    Carpenter, James R.
    Kenward, Michael G.
    White, Ian R.
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2007, 16 (03) : 259 - 275
  • [5] Sparse partial least squares regression for simultaneous dimension reduction and variable selection
    Chun, Hyonho
    Keles, Suenduez
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2010, 72 : 3 - 25
  • [6] Cohen J., 2003, APPL MULTIPLE REGRES, P3
  • [7] Glutathione-S-transferase P protects against endothelial dysfunction induced by exposure to tobacco smoke
    Conklin, Daniel J.
    Haberzettl, Petra
    Prough, Russell A.
    Bhatnagar, Aruni
    [J]. AMERICAN JOURNAL OF PHYSIOLOGY-HEART AND CIRCULATORY PHYSIOLOGY, 2009, 296 (05) : H1586 - H1597
  • [8] Exploring relationships in gene expressions: A partial least squares approach
    Datta, S
    [J]. GENE EXPRESSION, 2001, 9 (06) : 249 - 255
  • [9] Role of endothelial dysfunction in atherosclerosis
    Davignon, J
    Ganz, P
    [J]. CIRCULATION, 2004, 109 (23) : 27 - 32
  • [10] Regularization Paths for Generalized Linear Models via Coordinate Descent
    Friedman, Jerome
    Hastie, Trevor
    Tibshirani, Rob
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2010, 33 (01) : 1 - 22