Stable prediction in high-dimensional linear models

Cited: 18
Authors
Lin, Bingqing [1 ]
Wang, Qihua [1 ,2 ]
Zhang, Jun [1 ]
Pang, Zhen [3 ]
Affiliations
[1] Shenzhen Univ, Inst Stat Sci, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100190, Peoples R China
[3] Hong Kong Polytech Univ, Dept Appl Math, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Model averaging; Variable selection; Penalized regression; Screening
DOI
10.1007/s11222-016-9694-6
CLC number
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
We propose a Random Splitting Model Averaging (RSMA) procedure to achieve stable predictions in high-dimensional linear models. The idea is to randomly split the data: the training part is used to construct and estimate candidate models, and the test part is used to form second-level data. The second-level data are then used to estimate optimal weights for the candidate models by quadratic optimization under non-negativity constraints. The procedure has three appealing features: (1) RSMA avoids model overfitting and, as a result, improves prediction accuracy. (2) By adaptively choosing optimal weights, it produces more stable predictions through averaging over several candidate models. (3) Based on RSMA, a weighted importance index is proposed to rank the predictors and discriminate relevant predictors from irrelevant ones. Simulation studies and a real data analysis demonstrate that the RSMA procedure has excellent predictive performance and that the associated weighted importance index ranks the predictors well.
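The abstract describes the procedure concretely enough to sketch. Below is a minimal Python illustration of the random-splitting-and-averaging idea, assuming hypothetical names (rsma_weights, candidate_fits, n_splits, test_frac) and using scipy's nnls as a stand-in for the quadratic program under non-negativity constraints; it is an interpretation of the abstract, not the authors' implementation.

```python
# Illustrative sketch of the random-splitting model-averaging idea described
# in the abstract. Names (rsma_weights, n_splits, test_frac) are hypothetical;
# scipy's nnls stands in for the non-negatively constrained quadratic program.
import numpy as np
from scipy.optimize import nnls
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

def rsma_weights(X, y, candidate_fits, n_splits=20, test_frac=0.3, seed=0):
    """Estimate non-negative averaging weights for candidate models.

    candidate_fits: callables (X_train, y_train) -> fitted object with .predict.
    """
    rng = np.random.RandomState(seed)
    preds, targets = [], []
    for _ in range(n_splits):
        # Random split: the training part fits the candidate models,
        # the test part supplies the second-level data.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_frac, random_state=rng.randint(1 << 30))
        preds.append(np.column_stack(
            [fit(X_tr, y_tr).predict(X_te) for fit in candidate_fits]))
        targets.append(y_te)
    Z = np.vstack(preds)         # second-level design: candidate predictions
    t = np.concatenate(targets)  # stacked held-out responses
    # Quadratic optimization under non-negativity: min ||Z w - t||^2, w >= 0.
    w, _ = nnls(Z, t)
    return w / w.sum()           # normalize to sum to one (a common convention)

if __name__ == "__main__":
    # Toy example with a sparse true signal.
    rng = np.random.RandomState(1)
    X = rng.randn(200, 50)
    beta = np.zeros(50)
    beta[:5] = 2.0
    y = X @ beta + rng.randn(200)
    # Candidate models: lasso fits at several penalty levels.
    fits = [lambda X, y, a=a: Lasso(alpha=a).fit(X, y) for a in (0.05, 0.1, 0.5)]
    print("averaging weights:", rsma_weights(X, y, fits))
```

A weighted importance index in the spirit of feature (3) could then be formed by scoring each predictor with the weight-averaged frequency of its selection across the candidate models, though the paper itself should be consulted for the exact definition.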
Pages: 1401-1412
Page count: 12