Post-selection inference in regression models for group testing data

Times cited: 0
Authors
Shen, Qinyan [1 ]
Gregory, Karl [1 ]
Huang, Xianzheng [1 ]
Affiliation
[1] Univ South Carolina, Dept Stat, 219 LeConte,1523 Greene St, Columbia, SC 29208 USA
Keywords
confidence intervals; EM algorithm; individual testing; LASSO; variable selection; valid confidence intervals
DOI
10.1093/biomtc/ujae101
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We develop a methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the responses. Aiming at selecting important covariates while accounting for missing information in the response data, we apply the expectation-maximization algorithm to compute maximum likelihood estimators subject to LASSO penalization. Subsequent to variable selection, we make inferences on the selected covariate effects by extending post-selection inference methodology based on the polyhedral lemma. Empirical evidence from our extensive simulation study suggests that our post-selection inference results are more reliable than those from naive inference methods that use the same data to perform variable selection and inference without adjusting for variable selection.
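The estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several simplifying assumptions: only the individual-testing case is covered (one error-prone test result `z` per subject, no pooling), the assay sensitivity `se` and specificity `sp` are treated as known, and the penalized M-step is solved by proximal gradient descent (ISTA), which is a stand-in chosen here for brevity. All function names are hypothetical, and the post-selection inference step based on the polyhedral lemma is not shown.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def e_step(X, z, beta, se, sp):
    """Posterior P(Y_i = 1 | Z_i = z_i, x_i) for error-prone individual tests."""
    p = sigmoid(X @ beta)                      # P(Y_i = 1 | x_i) under the current fit
    like_pos = np.where(z == 1, se, 1.0 - se)  # P(Z_i = z_i | Y_i = 1)
    like_neg = np.where(z == 1, 1.0 - sp, sp)  # P(Z_i = z_i | Y_i = 0)
    num = like_pos * p
    return num / (num + like_neg * (1.0 - p))

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def m_step(X, w, beta, lam, n_iter=200):
    """L1-penalized logistic fit to the soft labels w (assumed solver: ISTA).

    The first column of X is assumed to be the intercept and is not penalized.
    """
    n = X.shape[0]
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature
    # of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - w) / n
        b = beta - step * grad
        new = soft_threshold(b, step * lam)
        new[0] = b[0]                          # keep the intercept unpenalized
        beta = new
    return beta

def em_lasso(X, z, se, sp, lam, n_em=50):
    """Alternate E- and M-steps starting from beta = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_em):
        w = e_step(X, z, beta, se, sp)         # expected true statuses
        beta = m_step(X, w, beta, lam)         # penalized update
    return beta
```

Because the complete-data logistic log-likelihood is linear in the unobserved responses, the E-step only needs each subject's posterior probability of being a true positive, and the M-step then fits a penalized logistic regression to those probabilities as soft labels. A call such as `em_lasso(X, z, se=0.95, sp=0.98, lam=0.05)` returns a coefficient vector whose nonzero entries are the selected covariates; the numeric values in that call are placeholders, not values taken from the paper.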
Pages: 12