Visualization and assessment of model selection uncertainty

Cited by: 2
Authors
Qin, Yichen [1]
Wang, Linna [2]
Li, Yang [3, 4]
Li, Rong [3, 4]
Affiliations
[1] Univ Cincinnati, Dept Operat Business Analyt & Informat Syst, Cincinnati, OH USA
[2] Univ Cincinnati, Dept Math Sci, Cincinnati, OH USA
[3] Renmin Univ China, Ctr Appl Stat, Beijing, Peoples R China
[4] Renmin Univ China, Sch Stat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bootstrap; Model selection deviation; Distribution of the selected model; VARIABLE-SELECTION; LASSO; REGRESSION; REGULARIZATION; LIKELIHOOD; SCAD;
DOI
10.1016/j.csda.2022.107598
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Although model selection is ubiquitous in scientific discovery, the stability and uncertainty of the selected model are often hard to evaluate. Characterizing the random behavior of the model selection procedure is the key to understanding and quantifying model selection uncertainty. To this end, several graphical tools, including G-plots and H-plots, are first proposed to visualize the distribution of the selected model. The concept of model selection deviation is then introduced to quantify model selection uncertainty. Similar to the standard error of an estimator, model selection deviation measures the stability of the model chosen by a model selection procedure. A bootstrap procedure for estimating this measure is discussed, and its desirable performance is demonstrated through simulation studies and real data analysis. © 2022 Published by Elsevier B.V.
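The idea of using the bootstrap to approximate the distribution of the selected model can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure from the paper: the selection rule (`select_model`, a marginal-correlation threshold), the simulated data, and the instability score (`mean_hamming_distance`) are all hypothetical stand-ins, and the score is not the paper's definition of model selection deviation.

```python
import random
from collections import Counter


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_model(X, y, threshold=0.3):
    """Toy selection rule (an assumption): keep predictors whose absolute
    marginal correlation with y exceeds the threshold."""
    p = len(X[0])
    return frozenset(
        j for j in range(p)
        if abs(correlation([row[j] for row in X], y)) > threshold
    )


def bootstrap_selection(X, y, n_boot=200, seed=0):
    """Rerun the selection rule on row-resampled data and tally the models."""
    rng = random.Random(seed)
    n = len(y)
    counts = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        counts[select_model([X[i] for i in idx], [y[i] for i in idx])] += 1
    return counts


def mean_hamming_distance(counts, reference):
    """Illustrative instability score (not the paper's definition): average
    size of the symmetric difference between each bootstrap model and the
    reference model."""
    total = sum(counts.values())
    return sum(len(m ^ reference) * c for m, c in counts.items()) / total


# Simulated data: y depends on predictors 0 and 1; predictors 2-4 are noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
y = [2.0 * row[0] + 1.0 * row[1] + rng.gauss(0, 1) for row in X]

selected = select_model(X, y)
counts = bootstrap_selection(X, y)
instability = mean_hamming_distance(counts, selected)

# Report the most frequently selected models and the instability score.
for model, c in counts.most_common(3):
    print(sorted(model), c / sum(counts.values()))
print("selected:", sorted(selected), "instability:", round(instability, 3))
```

The tally in `counts` is an empirical approximation of the distribution of the selected model; a score near zero suggests the rule returns nearly the same model across resamples, while larger values suggest less stable selection.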
Pages: 20
Related papers
43 in total
[11]   On the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensions [J].
Ewald, Karl ;
Schneider, Ulrike .
ELECTRONIC JOURNAL OF STATISTICS, 2020, 14 (01) :944-969
[12]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[13]   Fan ZH, 2017, ECONOMET STAT, V1, P167, DOI 10.1016/j.ecosta.2016.08.001
[14]   CONFIDENCE SETS FOR MODEL SELECTION BY F-TESTING [J].
Ferrari, Davide ;
Yang, Yuhong .
STATISTICA SINICA, 2015, 25 (04) :1637-1658
[15]   Regularization Paths for Generalized Linear Models via Coordinate Descent [J].
Friedman, Jerome ;
Hastie, Trevor ;
Tibshirani, Rob .
JOURNAL OF STATISTICAL SOFTWARE, 2010, 33 (01) :1-22
[16]   The Model Confidence Set [J].
Hansen, Peter R. ;
Lunde, Asger ;
Nason, James M. .
ECONOMETRICA, 2011, 79 (02) :453-497
[17]   Exploration of the variability of variable selection based on distances between bootstrap sample results [J].
Hennig, Christian ;
Sauerbrei, Willi .
ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (04) :933-963
[18]   Knight K, 2000, ANN STAT, V28, P1356
[19]   Transcriptional regulatory networks in Saccharomyces cerevisiae [J].
Lee, TI ;
Rinaldi, NJ ;
Robert, F ;
Odom, DT ;
Bar-Joseph, Z ;
Gerber, GK ;
Hannett, NM ;
Harbison, CT ;
Thompson, CM ;
Simon, I ;
Zeitlinger, J ;
Jennings, EG ;
Murray, HL ;
Gordon, DB ;
Ren, B ;
Wyrick, JJ ;
Tagne, JB ;
Volkert, TL ;
Fraenkel, E ;
Gifford, DK ;
Young, RA .
SCIENCE, 2002, 298 (5594) :799-804
[20]   Integrative interaction analysis using threshold gradient directed regularization [J].
Li, Yang ;
Li, Rong ;
Qin, Yichen ;
Wu, Mengyun ;
Ma, Shuangge .
APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2019, 35 (02) :354-375