g.ridge: An R Package for Generalized Ridge Regression for Sparse and High-Dimensional Linear Models

Cited by: 4
Authors
Emura, Takeshi [1, 2]
Matsumoto, Koutarou [2]
Uozumi, Ryuji [3]
Michimae, Hirofumi [4]
Affiliations
[1] Inst Stat Math, Res Ctr Med & Hlth Data Sci, Tokyo 1908562, Japan
[2] Kurume Univ, Biostat Ctr, Kurume 8300011, Japan
[3] Tokyo Inst Technol, Dept Ind Engn & Econ, Tokyo 1528552, Japan
[4] Kitasato Univ, Sch Pharm, Dept Clin Med Biostat, Tokyo 1088641, Japan
Source
SYMMETRY-BASEL | 2024 / Vol. 16 / Issue 02
Keywords
cross-validation; high-dimensional data; intracerebral hemorrhage; least squares estimator; mean square error; penalized regression; R package; shrinkage estimator; sparse model; CROSS-VALIDATION; ESTIMATORS; SELECTION; PREDICTION;
DOI
10.3390/sym16020223
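Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract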
Ridge regression is one of the most popular shrinkage estimation methods for linear models. Ridge regression effectively estimates regression coefficients in the presence of high-dimensional regressors. Recently, a generalized ridge estimator was suggested that involved generalizing the uniform shrinkage of ridge regression to non-uniform shrinkage; this was shown to perform well in sparse and high-dimensional linear models. In this paper, we introduce our newly developed R package "g.ridge" (first version published on 7 December 2023) that implements both the ridge estimator and generalized ridge estimator. The package is equipped with generalized cross-validation for the automatic estimation of shrinkage parameters. The package also includes a convenient tool for generating a design matrix. By simulations, we test the performance of the R package under sparse and high-dimensional settings with normal and skew-normal error distributions. From the simulation results, we conclude that the generalized ridge estimator is superior to the benchmark ridge estimator based on the R package "glmnet". Hence the generalized ridge estimator may be the most recommended estimator for sparse and high-dimensional models. We demonstrate the package using intracerebral hemorrhage data.
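The two ideas named in the abstract, non-uniform shrinkage and generalized cross-validation (GCV), can be illustrated with a minimal sketch. It is not the g.ridge package's code or API: the function names `solve`, `ridge_fit`, and `select_lambda`, the user-supplied `weights` vector, and the toy data are all assumptions made for illustration, and the package's own rule for choosing the non-uniform weights is not reproduced here. The estimator is taken to be (X'X + lam * diag(w))^{-1} X'y, with the standard GCV criterion (RSS / n) / (1 - tr(H) / n)^2.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, lam, weights=None):
    """Return (beta, gcv) for the penalty lam * diag(weights).

    weights=None gives ordinary (uniform) ridge regression; unequal
    weights give a non-uniform, generalized-ridge-style shrinkage.
    The weights are a hypothetical input, not the package's rule.
    """
    n, p = len(X), len(X[0])
    w = [1.0] * p if weights is None else weights
    # A = X'X + lam * diag(w)
    A = [[sum(X[i][j] * X[i][k] for i in range(n))
          + (lam * w[j] if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    Xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(A, Xty)
    fitted = [sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    rss = sum((y[i] - fitted[i]) ** 2 for i in range(n))
    # tr(H) with H = X A^{-1} X', computed as sum_i x_i' A^{-1} x_i
    trace_h = sum(
        sum(X[i][j] * v for j, v in enumerate(solve(A, X[i])))
        for i in range(n)
    )
    gcv = (rss / n) / (1.0 - trace_h / n) ** 2
    return beta, gcv


def select_lambda(X, y, grid, weights=None):
    """Pick the shrinkage parameter on a grid that minimizes GCV."""
    return min(grid, key=lambda lam: ridge_fit(X, y, lam, weights)[1])


# Toy data (made up): 5 observations, 2 regressors.
X_demo = [[1.0, 0.5], [0.8, -0.3], [-0.6, 1.1], [-1.2, -0.9], [0.1, 0.4]]
y_demo = [1.9, 1.2, -0.7, -2.6, 0.3]
lam_hat = select_lambda(X_demo, y_demo, [0.01, 0.1, 1.0, 10.0])
beta_hat, gcv_hat = ridge_fit(X_demo, y_demo, lam_hat)
```

Passing unequal `weights` (for example, a smaller weight for a regressor believed to be non-null) shrinks coefficients by different amounts, which is the sense in which the uniform ridge penalty is generalized. The pure-Python linear solver keeps the sketch dependency-free; a numerical library would be the usual choice for high-dimensional data.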
Pages: 15