NONPENALIZED VARIABLE SELECTION IN HIGH-DIMENSIONAL LINEAR MODEL SETTINGS VIA GENERALIZED FIDUCIAL INFERENCE

Cited by: 9
Authors
Williams, Jonathan P. [1 ]
Hannig, Jan [1 ]
Affiliations
[1] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27599 USA
Funding
U.S. National Science Foundation;
Keywords
Best subset selection; high-dimensional regression; L0 minimization; feature selection; regression;
DOI
10.1214/18-AOS1733
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject Classification Codes
020208 ; 070103 ; 0714 ;
Abstract
Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. This new procedure is able to effectively account for linear dependencies among subsets of covariates in a high-dimensional setting where p can grow almost exponentially in n, as well as in the classical setting where p <= n. It is shown that the procedure very naturally assigns small probabilities to subsets of covariates that include redundancies, by way of explicit L0 minimization. Furthermore, under a typical sparsity assumption, the proposed method is shown to be consistent in the sense that the probability assigned to the true sparse subset of covariates converges in probability to 1, either as n -> infinity alone or as n -> infinity and p -> infinity together. Only very reasonable conditions are needed, and little restriction is placed on the class of candidate subsets of covariates to achieve this consistency result.
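The abstract describes assigning probabilities to candidate subsets of covariates rather than thresholding coefficient magnitudes. The sketch below is a minimal illustration of that model-enumeration idea only: it uses a BIC-type weight, normalized over the enumerated models, as a stand-in for the paper's generalized fiducial model probabilities. The synthetic data, the score, and the k_max truncation are illustrative assumptions, not the authors' exact formulas (see DOI 10.1214/18-AOS1733 for those).

# Illustrative sketch only: NOT the paper's generalized fiducial probability.
# Each candidate subset of covariates is scored with a BIC-type weight and the
# weights are normalized into an approximate probability over sparse models,
# assuming the candidate subsets (size <= k_max) can be enumerated.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: n observations, p covariates, true model uses columns {0, 1}.
n, p = 100, 8
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[[0, 1]] = [2.0, -1.5]
y = X @ beta_true + rng.standard_normal(n)

def rss(model):
    """Residual sum of squares after a least-squares fit on the given columns."""
    if len(model) == 0:
        return float(y @ y)
    Xm = X[:, list(model)]
    coef, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    resid = y - Xm @ coef
    return float(resid @ resid)

k_max = 3  # enumerate all subsets of size <= k_max (feasible only for small p)
models = [m for k in range(k_max + 1)
          for m in itertools.combinations(range(p), k)]

# BIC-type log-score (larger is better); the noise variance is profiled out.
log_scores = np.array([-0.5 * (n * np.log(rss(m) / n) + len(m) * np.log(n))
                       for m in models])
weights = np.exp(log_scores - log_scores.max())
probs = weights / weights.sum()

# Report the highest-probability subsets; with this design the true subset
# (0, 1) should dominate, and redundant supersets receive smaller weight.
for i in np.argsort(probs)[::-1][:5]:
    print(f"model {models[i]!s:<12} approx. probability {probs[i]:.3f}")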
Pages: 1723-1753
Page count: 31
Related Papers
50 items in total
  • [41] Variable selection and subgroup analysis for high-dimensional censored data
    Zhang, Yu
    Wang, Jiangli
    Zhang, Weiping
    STATISTICAL THEORY AND RELATED FIELDS, 2024, 8 (03) : 211 - 231
  • [42] Variable selection in high-dimensional quantile varying coefficient models
    Tang, Yanlin
    Song, Xinyuan
    Wang, Huixia Judy
    Zhu, Zhongyi
    JOURNAL OF MULTIVARIATE ANALYSIS, 2013, 122 : 115 - 132
  • [43] Variable selection and estimation for high-dimensional spatial autoregressive models
    Cai, Liqian
    Maiti, Tapabrata
    SCANDINAVIAN JOURNAL OF STATISTICS, 2020, 47 (02) : 587 - 607
  • [44] RANKING-BASED VARIABLE SELECTION FOR HIGH-DIMENSIONAL DATA
    Baranowski, Rafal
    Chen, Yining
    Fryzlewicz, Piotr
    STATISTICA SINICA, 2020, 30 (03) : 1485 - 1516
  • [45] Sparse Bayesian variable selection for classifying high-dimensional data
    Yang, Aijun
    Lian, Heng
    Jiang, Xuejun
    Liu, Pengfei
    STATISTICS AND ITS INTERFACE, 2018, 11 (02) : 385 - 395
  • [46] A Robust Supervised Variable Selection for Noisy High-Dimensional Data
    Kalina, Jan
    Schlenker, Anna
    BIOMED RESEARCH INTERNATIONAL, 2015, 2015
  • [47] Clustering high-dimensional data via feature selection
    Liu, Tianqi
    Lu, Yu
    Zhu, Biqing
    Zhao, Hongyu
    BIOMETRICS, 2023, 79 (02) : 940 - 950
  • [48] Inference on Treatment Effects after Selection among High-Dimensional Controls
    Belloni, Alexandre
    Chernozhukov, Victor
    Hansen, Christian
    REVIEW OF ECONOMIC STUDIES, 2014, 81 (02) : 608 - 650
  • [49] High dimensional variable selection via tilting
    Cho, Haeran
    Fryzlewicz, Piotr
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2012, 74 : 593 - 622
  • [50] A Simple Information Criterion for Variable Selection in High-Dimensional Regression
    Pluntz, Matthieu
    Dalmasso, Cyril
    Tubert-Bitter, Pascale
    Ahmed, Ismail
    STATISTICS IN MEDICINE, 2025, 44 (1-2)