A Pseudo-Bayesian Shrinkage Approach to Regression with Missing Covariates

Cited by: 1
Authors
Zhang, Nanhua [1]
Little, Roderick J. [2]
Affiliations
[1] Univ S Florida, Coll Publ Hlth, Dept Epidemiol & Biostat, Tampa, FL 33612 USA
[2] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109 USA
Keywords
Complete-case analysis; Drop variables analysis; Gibbs sampling; Nonignorable modeling; Shrinkage; Variable selection; GENERALIZED LINEAR-MODELS; PATTERN-MIXTURE MODELS; MULTIVARIATE INCOMPLETE DATA; RANDOMIZED PHASE-II; HEPATOCELLULAR-CARCINOMA; VARIABLE SELECTION; LIKELIHOOD; INFERENCE;
DOI
10.1111/j.1541-0420.2011.01718.x
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
We consider the linear regression of outcome Y on regressors W and Z with some values of W missing, when our main interest is the effect of Z on Y, controlling for W. Three common approaches to regression with missing covariates are (i) complete-case analysis (CC), which discards the incomplete cases; (ii) ignorable likelihood methods, which base inference on the observed-data likelihood, assuming the missing data are missing at random (Rubin, 1976b); and (iii) nonignorable modeling, which posits a joint distribution of the variables and missing-data indicators. Another simple practical approach that has received little theoretical attention is to drop the regressors containing missing values from the regression model (DV, for drop variables). DV does not lead to bias when either (i) the regression coefficient of W is zero or (ii) W and Z are uncorrelated. We propose a pseudo-Bayesian approach for regression with missing covariates that compromises between the CC and DV estimates, exploiting information in the incomplete cases when the data support the DV assumptions. We illustrate favorable properties of the method by simulation and apply it to a liver cancer study. Extension of the method to more than one missing covariate is also discussed.
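The two conditions under which DV is unbiased can be checked numerically. The sketch below is not the authors' pseudo-Bayesian estimator; it only contrasts the CC and DV estimates of the Z coefficient on simulated data. The function name `simulate`, the true Z coefficient of 2.0, the 40% missingness rate, and the missing-completely-at-random mechanism are all assumptions made for illustration.

```python
import numpy as np

def simulate(beta_w, rho, n=20000, seed=0):
    """Return (CC, DV) estimates of the Z coefficient; the true value is 2.0.

    Illustrative assumptions: Z and W are standard normal with correlation
    rho, and W is missing completely at random for about 40% of the cases.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    w = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = 1.0 + 2.0 * z + beta_w * w + rng.standard_normal(n)

    # True where W is observed (hypothetical missingness mechanism).
    observed = rng.random(n) > 0.4

    # CC: regress Y on (1, Z, W) using only the cases with W observed.
    x_cc = np.column_stack([np.ones(observed.sum()), z[observed], w[observed]])
    cc = np.linalg.lstsq(x_cc, y[observed], rcond=None)[0][1]

    # DV: drop W and regress Y on (1, Z) using all cases.
    x_dv = np.column_stack([np.ones(n), z])
    dv = np.linalg.lstsq(x_dv, y, rcond=None)[0][1]
    return cc, dv

if __name__ == "__main__":
    for beta_w, rho in [(0.0, 0.6), (1.0, 0.0), (1.0, 0.6)]:
        cc, dv = simulate(beta_w, rho)
        print(f"beta_w={beta_w}, rho={rho}: CC={cc:.3f}, DV={dv:.3f}")
```

Under these assumptions the DV estimate targets 2.0 + beta_w * rho, so it stays near 2.0 when either beta_w or rho is zero and drifts away when both are nonzero, while CC stays near 2.0 in all three settings at the cost of discarding the incomplete cases.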
Pages: 933-942
Page count: 10