Reducing bias and mitigating the influence of excess of zeros in regression covariates with multi-outcome adaptive LAD-lasso

Cited by: 1

Authors:

Mottonen, Jyrki [1]

Lahderanta, Tero [2]

Salonen, Janne [3]

Sillanpaa, Mikko J. [2]

Affiliations:

[1] Univ Helsinki, Dept Math & Stat, Helsinki, Finland

[2] Univ Oulu, Res Unit Math Sci, Oulu, Finland

[3] Finnish Publ Sect Pens Provider Keva, Helsinki, Finland

Source:

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS | 2024 / Vol. 53 / No. 13

Keywords:

Multivariate analysis; p >> n regression; penalized regression; robust procedures; variable selection; zero-inflated continuous data; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; SHRINKAGE; MODELS;

DOI:

10.1080/03610926.2023.2189059

Chinese Library Classification (CLC) codes:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Zero-inflated explanatory variables, as opposed to zero-inflated outcome variables, are common in, for example, the environmental sciences. In this article, we address the problem of an excess of zero values in some continuous explanatory variables that are subject to multi-outcome lasso-regularized variable selection. In short, the problem arises because lasso-type shrinkage methods fail to distinguish between a zero in the regression coefficient and a zero in the corresponding value of the explanatory variable. This kind of confounding increases the number of false positives: not all non-zero regression coefficients necessarily represent true outcome effects. We present the adaptive LAD-lasso for multiple outcomes, which extends earlier work on the multi-outcome LAD-lasso by adding adaptive penalization. In addition to the well-known property of yielding less biased regression coefficients, we show that adaptivity also improves the method's ability to recover from the influence of excess zero values in continuous covariates.
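
To make the idea of adaptive penalization concrete, the following is a minimal sketch of a single-outcome adaptive LAD-lasso, written as a linear program. It is not the authors' multi-outcome estimator: the restriction to one outcome, the absence of an intercept, the pilot-based weights 1 / (|pilot coefficient|^gamma + eps), and the function names `lad_lasso` and `adaptive_lad_lasso` are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def lad_lasso(X, y, lam, weights):
    """Minimize sum_i |y_i - x_i'b| + lam * sum_j weights_j * |b_j| as an LP.

    Illustrative single-outcome formulation (an assumption, not the
    multi-outcome estimator described in the abstract).
    """
    n, p = X.shape
    # Decision variables, all >= 0: b_plus (p), b_minus (p), u_plus (n), u_minus (n).
    # b = b_plus - b_minus, residual = u_plus - u_minus.
    cost = np.concatenate([lam * weights, lam * weights, np.ones(n), np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    z = res.x
    return z[:p] - z[p:2 * p]


def adaptive_lad_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Two-step fit: unpenalized LAD pilot, then a weighted L1 penalty.

    Coefficients with a small pilot estimate receive a large penalty weight,
    so they are shrunk harder than coefficients with a large pilot estimate.
    """
    p = X.shape[1]
    b_pilot = lad_lasso(X, y, 0.0, np.ones(p))          # plain LAD regression
    weights = 1.0 / (np.abs(b_pilot) ** gamma + eps)    # hypothetical weight choice
    return lad_lasso(X, y, lam, weights), b_pilot


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 100, 6
    X = rng.normal(size=(n, p))
    X[:, 0] *= rng.random(n) < 0.3       # first covariate is mostly zeros
    beta = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0])
    y = X @ beta + rng.standard_t(df=2, size=n)   # heavy-tailed noise
    b_hat, b_pilot = adaptive_lad_lasso(X, y, lam=2.0)
    print("pilot   :", np.round(b_pilot, 3))
    print("adaptive:", np.round(b_hat, 3))
```

The penalty level `lam` is left as a free parameter here; in practice it would be chosen by a criterion such as cross-validation.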

Citation

Pages: 4730-4744

Number of pages: 15

