An Experimental and Theoretical Comparison of Model Selection Methods

Authors
Michael Kearns
Yishay Mansour
Andrew Y. Ng
Dana Ron
Affiliations
[1] AT&T Laboratories Research
[2] Department of Computer Science, Tel Aviv University
[3] Department of Computer Science, Carnegie Mellon University
[4] Laboratory of Computer Science, MIT
Source
Machine Learning | 1997 / Volume 27
Keywords
model selection; complexity regularization; cross validation; minimum description length principle; structural risk minimization; VC dimension
Abstract
We investigate the problem of model selection in the setting of supervised learning of Boolean functions from independent random examples. More precisely, we compare methods for finding a balance between the complexity of the hypothesis chosen and its observed error on a random training sample of limited size, when the goal is to minimize the resulting generalization error. We undertake a detailed comparison of three well-known model selection methods: a variation of Vapnik's Guaranteed Risk Minimization (GRM), an instance of Rissanen's Minimum Description Length Principle (MDL), and (hold-out) cross validation (CV). We introduce a general class of model selection methods, called penalty-based methods, that includes both GRM and MDL, and we provide general methods for analyzing such rules. We provide both controlled experimental evidence and formal theorems to support our conclusions.
Pages: 7 - 50
Number of pages: 44