Trade-off between predictive performance and FDR control for high-dimensional Gaussian model selection

Cited by: 0

Authors:

Lacroix, Perrine ^{[1,2,3]}

Martin, Marie-Laure ^{[2,3,4]}

Affiliations:

[1] Univ Paris Saclay, Lab Math Orsay, CNRS, Orsay, France

[2] Univ Paris Saclay, Univ Evry, Inst Plant Sci Paris Saclay IPS2, CNRS, F-91190 Gif Sur Yvette, France

[3] Univ Paris Cite, Inst Plant Sci Paris Saclay IPS2, F-91190 Gif Sur Yvette, France

[4] Univ Paris Saclay, AgroParisTech, INRAE, UMR MIA Paris Saclay, F-91120 Palaiseau, France

Source:

ELECTRONIC JOURNAL OF STATISTICS | 2024 / Vol. 18 / No. 2

Keywords:

Ordered variable selection; prediction; FDR; high-dimension; Gaussian regression; hyperparameter calibration; FALSE DISCOVERY RATE; REGRESSION; STABILITY; INFERENCE; SPARSITY;

DOI:

10.1214/24-EJS2260

Chinese Library Classification:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

In the context of high-dimensional Gaussian linear regression for ordered variables, we study the variable selection procedure via the minimization of the penalized least-squares criterion. We focus on model selection where the penalty function depends on an unknown multiplicative constant commonly calibrated for prediction. We propose a new proper calibration of this hyperparameter to simultaneously control predictive risk and false discovery rate. We obtain non-asymptotic bounds on the False Discovery Rate with respect to the hyperparameter and we provide an algorithm to calibrate it. This algorithm is based on quantities that can typically be observed in real data applications. The algorithm is validated in an extensive simulation study and is compared with several existing variable selection procedures. Finally, we study an extension of our approach to the case in which an ordering of the variables is not available.

Pages: 2886-2930

Page count: 45

References: 54 in total