PanIC: Consistent information criteria for general model selection problems

Times Cited: 0
Author
Nguyen, Hien Duy [1,2]
Affiliations
[1] La Trobe Univ, Sch Comp Engn & Math Sci, Bundoora, Vic, Australia
[2] Kyushu Univ, Inst Math Ind, Fukuoka, Japan
Funding
Australian Research Council
Keywords
finite mixture models; information criteria; least absolute deviation; loss minimisation; model selection; order selection; principal component analysis; support vector regression; MINIMAL PENALTIES; MIXTURE; INFERENCE; NUMBER; ORDER
DOI
10.1111/anzs.12426
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Model selection is a ubiquitous problem that arises in the application of many statistical and machine learning methods. In the likelihood and related settings, it is typical to use the method of information criteria (ICs) to choose the most parsimonious among competing models by penalizing the likelihood-based objective function. Theorems guaranteeing the consistency of ICs can be difficult to verify and are often bespoke to specific problem settings. We present a set of results that guarantee consistency for a class of ICs, which we call PanIC (from the Greek root 'pan', meaning 'of everything'), with easily verifiable regularity conditions. PanICs are applicable in any loss-based learning problem and are not exclusive to likelihood problems. We illustrate the verification of regularity conditions for model selection problems regarding finite mixture models, least absolute deviation and support vector regression, and principal component analysis, and demonstrate the effectiveness of PanICs for such problems via numerical simulations. Furthermore, we present new sufficient conditions for the consistency of BIC-like estimators and provide comparisons of the BIC with PanIC.
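To make the penalised-objective idea concrete, the following is a minimal, generic sketch of IC-based model selection (not the PanIC construction itself, whose penalty terms are defined in the paper): candidate models of increasing complexity are fitted by loss minimisation, each is scored with the BIC penalty k·log(n) added to the goodness-of-fit term, and the minimiser is selected. The polynomial-regression setting and all numerical values here are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a quadratic trend plus Gaussian noise.
n = 200
x = rng.uniform(-2, 2, n)
y = 1.0 + 0.5 * x - 1.5 * x**2 + rng.normal(0.0, 0.3, n)

def bic(degree):
    """BIC for a polynomial least-squares fit of the given degree.

    Uses the Gaussian-likelihood form n*log(RSS/n) + k*log(n),
    where k = degree + 1 is the number of fitted coefficients.
    """
    X = np.vander(x, degree + 1)                   # design matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # loss minimisation
    rss = np.sum((y - X @ beta) ** 2)
    k = degree + 1
    return n * np.log(rss / n) + k * np.log(n)

# Score candidate models of increasing order and pick the minimiser.
scores = {d: bic(d) for d in range(6)}
best = min(scores, key=scores.get)
print(best)  # with this signal-to-noise ratio, the quadratic model wins
```

The log(n) penalty rate is what distinguishes BIC-like (consistent) criteria from AIC-like ones, whose constant per-parameter penalty tends to over-select as n grows; the paper's contribution is a broader class of penalties with consistency guarantees beyond the likelihood setting.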
Pages: 441-466
Number of pages: 26