A nonparametric feature screening method for ultrahigh-dimensional missing response

Cited by: 9
Authors
Li, Xiaoxia [1, 2]
Tang, Niansheng [1, 2]
Xie, Jinhan [1, 2]
Yan, Xiaodong [1, 2]
Affiliations
[1] Yunnan Univ, Yunnan Key Lab Stat Modeling & Data Anal, Kunming 650500, Yunnan, Peoples R China
[2] Shandong Univ, Sch Econ, Jinan 250100, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature screening; Imputation; Marginal Spearman rank correlation; Missing at random; Ultrahigh-dimensional data; VARIABLE SELECTION; KOLMOGOROV FILTER; MODEL SELECTION; LINEAR-MODELS; LIKELIHOOD; SURVIVAL;
DOI
10.1016/j.csda.2019.106828
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper addresses feature screening for ultrahigh-dimensional data with responses missing at random. A novel nonparametric feature screening procedure is developed to identify the important features via the conditionally imputing marginal Spearman rank correlation. The proposed approach has several desirable properties. First, it is nonparametric and assumes no particular regression form relating the predictors to the response variable. Second, it is robust to outliers and heavy-tailed data. Third, under some regularity conditions, the procedure is shown to have the sure screening and ranking consistency properties. Simulation studies show that the proposed procedure outperforms several existing model-free screening procedures. An example taken from a microarray diffuse large-B-cell lymphoma study illustrates the proposed methodology. (C) 2019 Elsevier B.V. All rights reserved.
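As a rough illustration of the marginal screening idea in the abstract, the sketch below ranks features by the absolute Spearman rank correlation with the response and keeps the top d. It is not the authors' procedure: it simply drops rows with a missing response (a complete-case analysis) instead of performing the conditional imputation the paper proposes, and a complete-case statistic can be biased when responses are missing at random. The function names `ranks` and `spearman_screen`, the simulated data, and the cutoff d = 20 are all assumptions made for illustration.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n of a 1-D array (assumes no ties, as with continuous data)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman_screen(X, y, d):
    """Score each column of X by |Spearman correlation| with y, using only
    rows where y is observed (complete cases, NOT the paper's imputation),
    and return the indices of the d highest-scoring features."""
    observed = ~np.isnan(y)
    Xo, yo = X[observed], y[observed]
    ry = ranks(yo)
    ry = ry - ry.mean()
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rx = ranks(Xo[:, j])
        rx = rx - rx.mean()
        # Pearson correlation of the ranks is the Spearman correlation
        scores[j] = abs(rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry))
    top = np.argsort(scores)[::-1][:d]
    return top, scores

# Hypothetical simulated data: p >> n, monotone signals, heavy-tailed noise.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = np.exp(X[:, 0]) + 2.0 * X[:, 1] ** 3 + rng.standard_t(df=3, size=n)

# Missing at random: the missingness probability depends only on an
# observed covariate (here X[:, 2]), not on the response itself.
prob_missing = 1.0 / (1.0 + np.exp(-(X[:, 2] - 1.0)))
y_obs = np.where(rng.random(n) < prob_missing, np.nan, y)

selected, scores = spearman_screen(X, y_obs, d=20)
```

Because the score depends on the data only through ranks, it is unchanged by monotone transformations of the response and is insensitive to outliers, which is the robustness property the abstract refers to.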
Pages: 18
Related papers
50 results in total
  • [41] Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data
    Pan, Yingli
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2022, 173
  • [42] Feature Selection for Varying Coefficient Models With Ultrahigh-Dimensional Covariates
    Liu, Jingyuan
    Li, Runze
    Wu, Rongling
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (505) : 266 - 274
  • [43] Robust feature screening for multi-response trans-elliptical regression model with ultrahigh-dimensional covariates
    He, Yong
    Sun, Hao
    Ji, Jiadong
    Zhang, Xinsheng
    RANDOM MATRICES-THEORY AND APPLICATIONS, 2020, 9 (04)
  • [44] Regularized quantile regression for ultrahigh-dimensional data with nonignorable missing responses
    Ding, Xianwen
    Chen, Jiandong
    Chen, Xueping
    METRIKA, 2020, 83 (05) : 545 - 568
  • [45] Conditional screening for ultrahigh-dimensional survival data in case-cohort studies
    Zhang, Jing
    Zhou, Haibo
    Liu, Yanyan
    Cai, Jianwen
    LIFETIME DATA ANALYSIS, 2021, 27 (04) : 632 - 661
  • [46] Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates
    Liu, Zili
    Xiong, Zikang
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (06) : 1876 - 1894
  • [47] Conditional distance correlation screening for sparse ultrahigh-dimensional models
    Song, Fengli
    Chen, Yurong
    Lai, Peng
    APPLIED MATHEMATICAL MODELLING, 2020, 81 : 232 - 252
  • [48] FORWARD ADDITIVE REGRESSION FOR ULTRAHIGH-DIMENSIONAL NONPARAMETRIC ADDITIVE MODELS
    Zhong, Wei
    Duan, Sunpeng
    Zhu, Liping
    STATISTICA SINICA, 2020, 30 (01) : 175 - 192
  • [49] Feature screening and variable selection for partially linear models with ultrahigh-dimensional longitudinal data
    Liu, Jingyuan
    NEUROCOMPUTING, 2016, 195 : 202 - 210
  • [50] Feature screening for ultrahigh dimensional binary data
    Guan, Guoyu
    Shan, Na
    Guo, Jianhua
    STATISTICS AND ITS INTERFACE, 2018, 11 (01) : 41 - 50