An Experimental and Theoretical Comparison of Model Selection Methods

Cited by: 0

Authors:

Michael Kearns

Yishay Mansour

Andrew Y. Ng

Dana Ron

Affiliations:

[1] AT&T Laboratories Research, Department of Computer Science

[2] Tel Aviv University, Department of Computer Science

[3] Carnegie Mellon University, Laboratory of Computer Science

[4] MIT

Source:

Machine Learning | 1997 / Vol. 27

Keywords:

model selection; complexity regularization; cross validation; minimum description length principle; structural risk minimization; VC dimension

DOI:

Not available

Abstract:

We investigate the problem of model selection in the setting of supervised learning of boolean functions from independent random examples. More precisely, we compare methods for finding a balance between the complexity of the hypothesis chosen and its observed error on a random training sample of limited size, when the goal is that of minimizing the resulting generalization error. We undertake a detailed comparison of three well-known model selection methods — a variation of Vapnik's Guaranteed Risk Minimization (GRM), an instance of Rissanen's Minimum Description Length Principle (MDL), and (hold-out) cross validation (CV). We introduce a general class of model selection methods (called penalty-based methods) that includes both GRM and MDL, and provide general methods for analyzing such rules. We provide both controlled experimental evidence and formal theorems to support the following conclusions:

Pages: 7-50

Number of pages: 43