Overtraining, regularization and searching for a minimum, with application to neural networks

Cited by: 99

Authors:

Sjoberg, J

Ljung, L

Affiliations:

[1] Department of Electrical Engineering, Linkoping University, Linkoping

Source:

INTERNATIONAL JOURNAL OF CONTROL | 1995 / Vol. 62 / Issue 06

Funding:

Swedish Research Council;

Keywords:

DOI:

10.1080/00207179508921605

Chinese Library Classification number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

In this paper we discuss the role of criterion minimization as a means for parameter estimation. Most traditional methods, such as maximum likelihood and prediction error identification, are based on these principles. However, somewhat surprisingly, it turns out that it is not always 'optimal' to try to find the absolute minimum point of the criterion. The reason is that 'stopped minimization' (where the iterations have been terminated before the absolute minimum has been reached) has properties more or less identical to those of regularization (adding a parametric penalty term). Regularization is known to have beneficial effects on the variance of the parameter estimates and it reduces the 'variance contribution' of the misfit. This also explains the concept of 'overtraining' in neural nets. How does one know when to terminate the iterations then? A useful criterion would be to stop iterations when the criterion function applied to a validation data set no longer decreases. However, in this paper, we show that applying this technique extensively may result in an estimate that is an unregularized estimate for the total data set: estimation + validation data.
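
The stopping rule described in the abstract (terminate when the criterion on a validation data set no longer decreases) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a linear least-squares model stands in for a neural network, the iteration is plain gradient descent started at zero, and the function name `early_stopped_gd`, the step-size choice, and the synthetic data are hypothetical.

```python
import numpy as np


def early_stopped_gd(X_est, y_est, X_val, y_val, max_iter=5000):
    """Gradient descent on the estimation-data least-squares criterion,
    stopped as soon as the validation criterion no longer decreases.

    Illustrative assumption: linear model, zero initial parameters.
    Returns the stopped parameter vector and the number of accepted steps.
    """
    n, d = X_est.shape
    hessian = X_est.T @ X_est / n
    # Step size 1 / (largest Hessian eigenvalue) keeps the iteration stable.
    step = 1.0 / np.linalg.eigvalsh(hessian).max()

    theta = np.zeros(d)
    best_val = np.mean((X_val @ theta - y_val) ** 2)
    for k in range(1, max_iter + 1):
        grad = X_est.T @ (X_est @ theta - y_est) / n
        candidate = theta - step * grad
        val = np.mean((X_val @ candidate - y_val) ** 2)
        if val >= best_val:
            # Validation criterion stopped decreasing: keep the previous iterate.
            return theta, k - 1
        theta, best_val = candidate, val
    return theta, max_iter


# Synthetic data (hypothetical): 40 estimation samples, 40 validation samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))
y = X @ rng.normal(size=20) + rng.normal(size=80)
X_est, y_est, X_val, y_val = X[:40], y[:40], X[40:], y[40:]

theta_es, iters = early_stopped_gd(X_est, y_est, X_val, y_val)
# Unregularized estimate: the absolute minimum of the estimation criterion.
theta_ls = np.linalg.lstsq(X_est, y_est, rcond=None)[0]
print(iters, np.linalg.norm(theta_es), np.linalg.norm(theta_ls))
```

In this setting (zero start, step size no larger than the inverse of the largest Hessian eigenvalue), the norm of the stopped estimate cannot exceed the norm of the unregularized least-squares estimate, which is the shrinkage-like behaviour the abstract attributes to stopped minimization.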

Pages: 1391-1407

Page count: 17

References (13 in total):

[1] DENNIS JE, 1983, NUMERICAL METHODS UN

[2] RIDGE REGRESSION AND JAMES-STEIN ESTIMATION - REVIEW AND COMMENTS [J].