NONPENALIZED VARIABLE SELECTION IN HIGH-DIMENSIONAL LINEAR MODEL SETTINGS VIA GENERALIZED FIDUCIAL INFERENCE

Times Cited: 9
Authors
Williams, Jonathan P. [1 ]
Hannig, Jan [1 ]
Affiliations
[1] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27599 USA
Funding
National Science Foundation (USA)
Keywords
Best subset selection; high-dimensional regression; L0 minimization; feature selection; regression
DOI
10.1214/18-AOS1733
Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline codes
020208; 070103; 0714
Abstract
Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. This new procedure is able to effectively account for linear dependencies among subsets of covariates in a high-dimensional setting where p can grow almost exponentially in n, as well as in the classical setting where p <= n. It is shown that the procedure very naturally assigns small probabilities to subsets of covariates which include redundancies, by way of explicit L0 minimization. Furthermore, with a typical sparsity assumption, it is shown that the proposed method is consistent in the sense that the probability assigned to the true sparse subset of covariates converges in probability to 1 as n -> infinity, or as n -> infinity and p -> infinity. Only very reasonable conditions are needed, and little restriction is placed on the class of possible subsets of covariates to achieve this consistency result.
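To give a concrete feel for the idea of assigning probabilities to candidate subsets of covariates, the Python sketch below performs an exhaustive best-subset search and turns a BIC-style score (standing in for the explicit L0 penalty discussed above) into normalized pseudo-probabilities over models. This is only an illustration under simplifying assumptions: it does not reproduce the paper's exact generalized fiducial probability formula, and the function name subset_scores and the max_size cap are hypothetical choices made for the example.

    # Illustrative sketch only: exhaustive best-subset search scored by a
    # BIC-like criterion and normalized into pseudo-probabilities over models.
    # This mimics the flavor of assigning probabilities to covariate subsets;
    # it is NOT the paper's generalized fiducial probability formula.
    from itertools import combinations

    import numpy as np


    def subset_scores(X, y, max_size=3):
        """Score every covariate subset up to max_size and return
        normalized pseudo-probabilities (larger = more plausible model)."""
        n, p = X.shape
        log_scores = {}
        for k in range(1, max_size + 1):
            for subset in combinations(range(p), k):
                Xs = X[:, list(subset)]
                beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                rss = np.sum((y - Xs @ beta) ** 2)
                # BIC-style score: fit term plus an explicit penalty on the
                # subset size k, playing the role of the L0 penalty.
                log_scores[subset] = -0.5 * n * np.log(rss / n) - 0.5 * k * np.log(n)
        # Normalize on the log scale to obtain pseudo-probabilities.
        m = max(log_scores.values())
        weights = {s: np.exp(v - m) for s, v in log_scores.items()}
        total = sum(weights.values())
        return {s: w / total for s, w in weights.items()}


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, p = 100, 8
        X = rng.standard_normal((n, p))
        y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(n)
        probs = subset_scores(X, y)
        best = max(probs, key=probs.get)
        print("highest-probability subset:", best, "weight:", round(probs[best], 3))

In this toy run the weight concentrates on the subset {0, 3} that generated the response; collinear designs, where the paper's fiducial probabilities are designed to penalize redundant subsets, would spread the weight across competing models.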
Pages: 1723-1753
Number of pages: 31