Variable selection for both outcomes and predictors: sparse multivariate principal covariates regression

Cited by: 0
Authors
Park, Soogeun [1]
Ceulemans, Eva [2]
Van Deun, Katrijn [1]
Affiliations
[1] Tilburg Univ, Tilburg, Netherlands
[2] Katholieke Univ Leuven, Leuven, Belgium
Keywords
Outcome variable selection; Response variable selection; Response selection; Variable selection; Principal covariates regression; Dimension reduction; MODEL SELECTION; MULTI-TRAIT; COMPONENTS; ALGORITHM; NUMBER; GWAS;
DOI
10.1007/s10994-024-06520-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Datasets comprising large sets of both predictor and outcome variables are becoming more widely used in research. In addition to the well-known problems of model complexity and predictor variable selection, predictive modelling with such large data also presents a relatively novel and under-studied challenge: outcome variable selection. Certain outcome variables in the data may not be adequately predicted by the given sets of predictors. In this paper, we propose Sparse Multivariate Principal Covariates Regression, a method that addresses these issues together by extending the Principal Covariates Regression model to incorporate sparsity penalties on both predictor and outcome variables. Our method is one of the first to perform variable selection for predictors and outcomes simultaneously. Moreover, by relying on summary variables that explain the variance in both predictor and outcome variables, the method offers a sparse and succinct model representation of the data. In a simulation study, the method outperformed methods with similar aims, such as sparse Partial Least Squares, in predicting the outcome variables and recovering the population parameters. Lastly, we applied the method to an empirical dataset to illustrate its use in practice.
Pages: 7319-7370
Page count: 52
Related papers
50 in total
  • [1] Variable selection in multivariate regression models with measurement error in covariates
    Cui, Jingyu
    Yi, Grace Y.
    JOURNAL OF MULTIVARIATE ANALYSIS, 2024, 202
  • [2] Variable selection in multivariate linear regression with random predictors
    Mbina, Alban Mbina
    Nkiet, Guy Martial
    N'guessan, Assi
    SOUTH AFRICAN STATISTICAL JOURNAL, 2023, 57 (01) : 27 - 44
  • [3] Model selection in principal covariates regression
    Vervloet, Marlies
    Van Deun, Katrijn
    Van den Noortgate, Wim
    Ceulemans, Eva
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2016, 151 : 26 - 33
  • [4] Variable selection in regression with compositional covariates
    Lin, Wei
    Shi, Pixu
    Feng, Rui
    Li, Hongzhe
    BIOMETRIKA, 2014, 101 (04) : 785 - 797
  • [5] ON VARIABLE SELECTION IN MULTIVARIATE REGRESSION
    SPARKS, RS
    ZUCCHINI, W
    COUTSOURIDES, D
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1985, 14 (07) : 1569 - 1587
  • [6] Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
    Hu, Liangyuan
    Lin, Jung-Yi Joyce
    Ji, Jiayi
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (12) : 2651 - 2671
  • [7] On the selection of the weighting parameter value in Principal Covariates Regression
    Vervloet, Marlies
    Van Deun, Katrijn
    Van den Noortgate, Wim
    Ceulemans, Eva
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2013, 123 : 36 - 43
  • [8] Improving sample and feature selection with principal covariates regression
    Cersonsky, Rose K.
    Helfrecht, Benjamin A.
    Engel, Edgar A.
    Kliavinek, Sergei
    Ceriotti, Michele
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2021, 2 (03)
  • [9] VARIABLE SELECTION IN NONPARAMETRIC REGRESSION WITH CONTINUOUS COVARIATES
    ZHANG, P
    ANNALS OF STATISTICS, 1991, 19 (04) : 1869 - 1882
  • [10] VARIABLE SELECTION IN NONPARAMETRIC REGRESSION WITH CATEGORICAL COVARIATES
    BICKEL, P
    PING, Z
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1992, 87 (417) : 90 - 97