High-dimensional penalty selection via minimum description length principle

Cited by: 0
Authors
Kohei Miyaguchi
Kenji Yamanishi
Affiliation
[1] The University of Tokyo
Source
Machine Learning | 2018 / Volume 107
Keywords
Minimum description length principle; Luckiness normalized maximum likelihood; Regularized empirical risk minimization; Penalty selection; Concave–convex procedure;
DOI
Not available
Abstract
We tackle the problem of penalty selection for regularization on the basis of the minimum description length (MDL) principle. In particular, we consider the case in which the design space of the penalty function is high-dimensional. In this situation, the luckiness-normalized-maximum-likelihood (LNML) minimization approach is favorable, because LNML quantifies the goodness of regularized models with any form of penalty function from the viewpoint of the MDL principle, and it guides us to a good penalty function through the high-dimensional space. However, the minimization of LNML entails two major challenges: (1) computing the normalizing factor of LNML and (2) minimizing LNML in high-dimensional spaces. In this paper, we present a novel regularization selection method (MDL-RS), in which a tight upper bound of LNML (uLNML) is minimized with a local convergence guarantee. Our main contribution is the derivation of uLNML, a uniform-gap upper bound of LNML in analytic form. This addresses both challenges in an approximate manner, because it allows us to approximate LNML accurately and then minimize it efficiently. Experimental results show that MDL-RS improves the generalization performance of regularized estimates, especially when the model has redundant parameters.
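The abstract describes LNML only at a high level. As a rough, self-contained illustration of the underlying idea (not the paper's uLNML bound, its optimization procedure, or its high-dimensional setting), the sketch below computes the exact LNML code length of a Bernoulli sequence under a toy quadratic "luckiness" penalty, then selects the penalty weight lambda that minimizes that code length. The quadratic penalty centered at 0.5, the grid maximization, and the candidate lambda values are all illustrative assumptions, not taken from the paper.

```python
import numpy as np
from math import comb, log

# Grid over the Bernoulli parameter theta, used for the penalized maximization.
THETA_GRID = np.linspace(1e-3, 1 - 1e-3, 999)

def penalized_ml(k, n, lam):
    """Max over theta of log-likelihood minus a toy quadratic penalty.

    k ones out of n Bernoulli trials; penalty lam * (theta - 0.5)^2 shrinks
    the estimate toward 0.5 (an illustrative choice, not the paper's penalty).
    """
    ll = k * np.log(THETA_GRID) + (n - k) * np.log(1 - THETA_GRID)
    pen = lam * (THETA_GRID - 0.5) ** 2
    return float(np.max(ll - pen))

def lnml_code_length(k, n, lam):
    """LNML code length of the observed sequence.

    The normalizing factor sums the penalized maximum likelihood over all
    2^n sequences, grouped by their count of ones (the likelihood depends
    only on that count), which is feasible exactly for small n.
    """
    terms = [log(comb(n, j)) + penalized_ml(j, n, lam) for j in range(n + 1)]
    log_z = float(np.logaddexp.reduce(terms))
    return -penalized_ml(k, n, lam) + log_z

# Penalty selection: pick the lambda giving the shortest LNML code length
# for the observed data (9 ones out of 10 trials; values are illustrative).
k_obs, n_obs = 9, 10
candidates = [0.0, 0.5, 1.0, 2.0, 5.0]
best_lam = min(candidates, key=lambda lam: lnml_code_length(k_obs, n_obs, lam))
print("selected penalty weight:", best_lam)
```

The paper's contribution is precisely that this normalizing factor is intractable in realistic settings, motivating the analytic upper bound uLNML; the exhaustive sum above works only because the toy model has a one-dimensional parameter and tiny n.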
Pages: 1283-1302
Page count: 19