High-Dimensional LASSO-Based Computational Regression Models: Regularization, Shrinkage, and Selection

Cited by: 100

Authors:

Emmert-Streib, Frank [1, 2, 3]

Dehmer, Matthias [4, 5, 6]

Affiliations:

[1] Tampere Univ, Predict Soc, Tampere 33720, Finland

[2] Tampere Univ, Fac Informat Technol & Commun Sci, Data Analyt Lab, Tampere 33720, Finland

[3] Inst Biosci & Med Technol, Tampere 33520, Finland

[4] Univ Appl Sci Upper Austria, Inst Intelligent Prod, Fac Management, Steyr Campus, A-4400 Steyr, Austria

[5] UMIT, Dept Mechatron & Biomed Comp Sci, A-6060 Hall In Tirol, Austria

[6] Nankai Univ, Coll Comp & Control Engn, Tianjin 300071, Peoples R China

Source:

MACHINE LEARNING AND KNOWLEDGE EXTRACTION | 2019 / Volume 1 / Issue 01

Funding:

Austrian Science Fund;

Keywords:

machine learning; statistics; regression models; LASSO; regularization; high-dimensional data; data science; shrinkage; feature selection; VARIABLE SELECTION; DANTZIG SELECTOR; STATISTICAL ESTIMATION; PENALIZED REGRESSION; ADAPTIVE LASSO; EXPRESSION; DESIGN; LARGER;

DOI:

10.3390/make1010021

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Regression models are a form of supervised learning methods that are important for machine learning, statistics, and general data science. Although classical ordinary least squares (OLS) regression models have been known for a long time, there have been many new developments in recent years that extend this model significantly. Above all, the least absolute shrinkage and selection operator (LASSO) model has gained considerable interest. In this paper, we review general regression models with a focus on the LASSO and extensions thereof, including the adaptive LASSO, elastic net, and group LASSO. We discuss the regularization terms responsible for inducing coefficient shrinkage and variable selection, which lead to improved performance metrics of these regression models. This makes these modern, computational regression models valuable tools for analyzing high-dimensional problems.


Pages: 359-383

Number of pages: 25
