Interpolating Predictors in High-Dimensional Factor Regression

Cited by: 0
Authors
Bunea, Florentina [1]
Strimas-Mackey, Seth [1]
Wegkamp, Marten [1,2]
Affiliations
[1] Cornell University, Department of Statistics and Data Science, Ithaca, NY 14850 USA
[2] Cornell University, Department of Mathematics, Ithaca, NY 14850 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Interpolation; minimum-norm predictor; finite-sample risk bounds; prediction; factor models; high-dimensional regression; principal components
DOI
Not available
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models. If the effective rank of the covariance matrix Σ of the p regression features is much larger than the sample size n, we show that the minimum-norm interpolating predictor is not desirable, as its risk approaches the risk of trivially predicting the response by 0. However, our detailed finite-sample analysis reveals, surprisingly, that this behavior is absent when the regression response and the features are jointly low-dimensional, following a widely used factor regression model. Within this popular model class, when the effective rank of Σ is smaller than n, while still allowing p >> n, both the bias and the variance terms of the excess risk can be controlled, and the risk of the minimum-norm interpolating predictor approaches optimal benchmarks. Moreover, through a detailed analysis of the bias term, we exhibit model classes under which our upper bound on the excess risk approaches zero, while the corresponding upper bound in the recent work of Bartlett et al. (2020) diverges. Furthermore, we show that the minimum-norm interpolating predictor, analyzed under the factor regression model, despite being model-agnostic and free of tuning parameters, can have risk similar to that of predictors based on principal components regression and ridge regression, and can improve over LASSO-based predictors, in the high-dimensional regime.
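For orientation, the display below sketches, under standard definitions, the three objects the abstract turns on. It is a minimal sketch only: the factor-model notation (A, Z, W, θ, ε, K) is illustrative, and the precise assumptions and constants are those stated in the paper.

% Minimum-norm interpolating predictor: among all coefficient vectors
% that fit the training data (X, y) exactly, take the one of smallest
% Euclidean norm; it has a closed form whenever rank(X) = n < p.
\hat{\beta} \;=\; \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \bigl\{ \|\beta\|_2 \,:\, X\beta = y \bigr\} \;=\; X^{\top}\,(X X^{\top})^{-1}\, y

% Effective rank of the feature covariance \Sigma, which drives the
% dichotomy described in the abstract: r_e(\Sigma) >> n yields risk no
% better than the null predictor, while r_e(\Sigma) < n (with p >> n
% still allowed) permits near-optimal risk under the factor model.
r_e(\Sigma) \;=\; \frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|_{\mathrm{op}}}

% A widely used factor regression model: the p features and the
% response are jointly driven by K << p latent factors Z.
X = A Z + W, \qquad Y = Z^{\top}\theta + \varepsilon, \qquad A \in \mathbb{R}^{p \times K}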
Pages: 60