A selective overview of feature screening for ultrahigh-dimensional data

Cited by: 0
Authors
JingYuan Liu
Wei Zhong
RunZe Li
Affiliations
[1] Xiamen University, Department of Statistics, School of Economics
[2] Xiamen University, Wang Yanan Institute for Studies in Economics
[3] Xiamen University, Fujian Key Laboratory of Statistical Science
[4] Pennsylvania State University, Department of Statistics and The Methodology Center
Source
Science China Mathematics | 2015 / Vol. 58
Keywords
correlation learning; distance correlation; sure independence screening; sure joint screening; sure screening property; ultrahigh-dimensional data; 62H12; 62H20
DOI
Not available
Abstract
High-dimensional data are frequently collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of the data may grow exponentially with the sample size. Such data are called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening under specific models and on the motivation for model-free feature screening procedures.
Pages: 1-22
Page count: 21
Related papers
50 in total
  • [31] Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems
    Nandy, Debmalya
    Chiaromonte, Francesca
    Li, Runze
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (539) : 1516 - 1529
  • [32] Feature Screening for Ultrahigh-dimensional Censored Data with Varying Coefficient Single-index Model
    Yi Liu
    Acta Mathematicae Applicatae Sinica, English Series, 2019, 35 : 845 - 861
  • [33] Robust conditional nonparametric independence screening for ultrahigh-dimensional data
    Zhang, Shucong
    Pan, Jing
    Zhou, Yong
    STATISTICS & PROBABILITY LETTERS, 2018, 143 : 95 - 101
  • [34] Conditional screening for ultrahigh-dimensional survival data in case-cohort studies
    Zhang, Jing
    Zhou, Haibo
    Liu, Yanyan
    Cai, Jianwen
    LIFETIME DATA ANALYSIS, 2021, 27 (04) : 632 - 661
  • [35] A new nonparametric screening method for ultrahigh-dimensional survival data
    Liu, Yanyan
    Zhang, Jing
    Zhao, Xingqiu
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 119 : 74 - 85
  • [36] Conditional screening for ultrahigh-dimensional survival data in case-cohort studies
    Jing Zhang
    Haibo Zhou
    Yanyan Liu
    Jianwen Cai
    Lifetime Data Analysis, 2021, 27 : 632 - 661
  • [37] Feature screening in ultrahigh-dimensional partially linear models with missing responses at random
    Tang, Niansheng
    Xia, Linli
    Yan, Xiaodong
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2019, 133 : 208 - 227
  • [38] Nonparametric screening and feature selection for ultrahigh-dimensional Case II interval-censored failure time data
    Hu, Qiang
    Zhu, Liang
    Liu, Yanyan
    Sun, Jianguo
    Srivastava, Deo Kumar
    Robison, Leslie L.
    BIOMETRICAL JOURNAL, 2020, 62 (08) : 1909 - 1925
  • [39] Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates
    Liu, Zili
    Xiong, Zikang
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (06) : 1876 - 1894
  • [40] Stable feature screening for ultrahigh dimensional data
    Peng Lai
    Fengli Song
    Yufei Gao
    Journal of the Korean Statistical Society, 2019, 48 : 221 - 232