False discovery rate control for high-dimensional Cox model with uneven data splitting

Authors
Ge, Yeheng [1]
Zhang, Sijia [1]
Zhang, Xiao [2]
Affiliations
[1] Shanghai Univ Finance & Econ, Sch Stat & Management, Shanghai, Peoples R China
[2] Chinese Univ Hong Kong, Sch Data Sci, Shenzhen, Peoples R China
Keywords
De-sparsified estimator; false discovery control; symmetric-based statistic; Cox model; PROPORTIONAL HAZARDS MODEL; VARIABLE SELECTION; CONFIDENCE-INTERVALS; REGRESSION; REGIONS; REGULARIZATION; LASSO; TESTS
DOI
10.1080/00949655.2023.2290135
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Statistical inference for high-dimensional survival data is important for obtaining valid scientific results in many research areas, including biomedical studies and financial risk management. In this paper, a novel framework for feature selection in the Cox model is proposed, which achieves false discovery rate (FDR) control asymptotically. The key step is to construct a sequence of ranking statistics based on two independent estimators of the regression coefficients. FDR control is achieved by choosing a data-driven threshold along the ranking of symmetric-based statistics. A de-sparsified estimator and an uneven data splitting strategy are employed to improve the robustness of the variable selection results and the power in finite samples. We establish the asymptotic FDR control property of the proposed approach at any designated level. Extensive simulation studies and an empirical application to a P2P loan dataset confirm the robustness of the proposed method in FDR control and show that it often achieves higher power than competing methods.
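The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the symmetric-based statistic or of the threshold rule, so the sketch assumes a common mirror-type statistic, sign(b1*b2) * (|b1| + |b2|), and a simple ratio estimate of the false discovery proportion. The function names `symmetric_statistics`, `fdr_threshold`, and `select_features` are hypothetical, and the two coefficient vectors are assumed to come from independent estimators fitted on the two parts of a data split (for example, de-sparsified Cox estimates), which the sketch does not compute.

```python
import numpy as np


def symmetric_statistics(b1, b2):
    """Combine two independent coefficient estimates into one statistic per feature.

    Assumed form (not taken from the paper): large and positive when both
    estimates agree in sign and are far from zero, and roughly symmetric
    around zero for null features.
    """
    b1 = np.asarray(b1, dtype=float)
    b2 = np.asarray(b2, dtype=float)
    return np.sign(b1 * b2) * (np.abs(b1) + np.abs(b2))


def fdr_threshold(stats, q):
    """Return the smallest threshold t whose estimated FDP is at most q.

    The count of statistics at or below -t serves as an estimate of the
    number of false discoveries among the statistics at or above t.
    """
    candidates = np.sort(np.abs(stats[stats != 0]))
    for t in candidates:
        false_est = np.sum(stats <= -t)
        selected = max(np.sum(stats >= t), 1)
        if false_est / selected <= q:
            return t
    return np.inf  # no threshold meets the target level; select nothing


def select_features(b1, b2, q=0.1):
    """Indices of features whose statistic reaches the data-driven threshold."""
    stats = symmetric_statistics(b1, b2)
    tau = fdr_threshold(stats, q)
    return np.flatnonzero(stats >= tau)
```

Calling `select_features(b1, b2, q=0.2)` with two estimated coefficient vectors returns the indices of the selected features at target level 0.2. Features whose two estimates disagree in sign receive negative statistics and are used only to calibrate the threshold.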
Pages: 1462-1493
Number of pages: 32