Quantile-Composited Feature Screening for Ultrahigh-Dimensional Data

Citations: 0
Authors
Chen, Shuaishuai [1]
Lu, Jun [2]
Affiliations
[1] Shandong Univ, Sch Math, Jinan 250100, Peoples R China
[2] Natl Univ Def & Technol, Sch Sci, Changsha 410000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
feature screening; discriminative analysis; quantile-composited; classification
DOI
10.3390/math11102398
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
Ultrahigh-dimensional grouped data are frequently encountered by biostatisticians working on multi-class categorical problems. To rapidly screen out the null predictors, this paper proposes a quantile-composited feature screening procedure. The method first transforms each continuous predictor into a Bernoulli variable by thresholding it at a given quantile. The independence between the response and each predictor can then be assessed with the Pearson chi-square statistic. The proposed method has three salient features: (1) it is robust to high-dimensional heterogeneous data; (2) it is model-free, requiring no regression structure to be specified between the covariates and the outcome variable; (3) it has a low computational cost, with computational complexity controlled at the order of the sample size. Under some mild conditions, the method is shown to achieve the sure screening property without imposing any moment condition on the predictors. Numerical studies and real data analyses further confirm the effectiveness of the new screening procedure.
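The screening idea described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in this record: the function names `chi2_stat` and `screen`, the quantile grid `taus`, the cutoff `top`, and the synthetic data are all hypothetical, and the per-quantile chi-square statistics are simply summed as one plausible way of compositing them. The paper's actual statistic and thresholding rule may differ.

```python
import numpy as np


def chi2_stat(b, y):
    """Pearson chi-square statistic between a 0/1 vector b and class labels y."""
    n = len(y)
    stat = 0.0
    for k in np.unique(y):
        for v in (0, 1):
            observed = np.sum((y == k) & (b == v))
            expected = np.sum(y == k) * np.sum(b == v) / n
            if expected > 0:  # skip empty cells to avoid division by zero
                stat += (observed - expected) ** 2 / expected
    return stat


def screen(X, y, taus=(0.25, 0.5, 0.75), top=5):
    """Rank predictors by chi-square statistics summed over quantile levels.

    Assumption: compositing is done by a plain sum over `taus`.
    """
    n, p = X.shape
    scores = np.zeros(p)
    for j in range(p):
        for tau in taus:
            # Dichotomize predictor j at its empirical tau-quantile.
            b = (X[:, j] > np.quantile(X[:, j], tau)).astype(int)
            scores[j] += chi2_stat(b, y)
    ranked = np.argsort(scores)[::-1]  # largest score first
    return ranked[:top], scores


# Synthetic illustration (hypothetical data): heavy-tailed Cauchy noise,
# three classes, and only predictor 0 depends on the class label.
rng = np.random.default_rng(0)
n, p = 200, 500
y = rng.integers(0, 3, size=n)
X = rng.standard_cauchy(size=(n, p))
X[:, 0] += 4 * y

selected, scores = screen(X, y)
print("selected predictors:", selected)
```

Because each predictor enters only through indicator variables, the score does not depend on the predictor's moments, which is consistent with the robustness claim in the abstract. The Cauchy noise in the example is chosen to exercise that case.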
Pages: 21
Related papers
26 in total
  • [1] MARGINAL EMPIRICAL LIKELIHOOD AND SURE INDEPENDENCE FEATURE SCREENING
    Chang, Jinyuan
    Tang, Cheng Yong
    Wu, Yichao
    [J]. ANNALS OF STATISTICS, 2013, 41 (04) : 2123 - 2148
  • [2] Sparse Discriminant Analysis
    Clemmensen, Line
    Hastie, Trevor
    Witten, Daniela
    Ersboll, Bjarne
    [J]. TECHNOMETRICS, 2011, 53 (04) : 406 - 413
  • [3] Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis
    Cui, Hengjian
    Li, Runze
    Zhong, Wei
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2015, 110 (510) : 630 - 641
  • [4] BagBoosting for tumor classification with gene expression data
    Dettling, M
    [J]. BIOINFORMATICS, 2004, 20 (18) : 3583 - 3593
  • [5] Sure independence screening for ultrahigh dimensional feature space
    Fan, Jianqing
    Lv, Jinchi
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 : 849 - 883
  • [6] Fan JQ, 2012, J ROY STAT SOC B, V74, P745, DOI 10.1111/j.1467-9868.2012.01029.x
  • [7] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Varying Coefficient Models
    Fan, Jianqing
    Ma, Yunbei
    Dai, Wei
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (507) : 1270 - 1284
  • [8] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models
    Fan, Jianqing
    Feng, Yang
    Song, Rui
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2011, 106 (494) : 544 - 557
  • [9] Threshold Selection in Feature Screening for Error Rate Control
    Guo, Xu
    Ren, Haojie
    Zou, Changliang
    Li, Runze
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (543) : 1773 - 1785
  • [10] QUANTILE-ADAPTIVE MODEL-FREE VARIABLE SCREENING FOR HIGH-DIMENSIONAL HETEROGENEOUS DATA
    He, Xuming
    Wang, Lan
    Hong, Hyokyoung Grace
    [J]. ANNALS OF STATISTICS, 2013, 41 (01) : 342 - 369