STATISTICAL THEORY OF LEARNING CURVES UNDER ENTROPIC LOSS CRITERION

Cited by: 90

Authors:

AMARI, S

MURATA, N

Source:

NEURAL COMPUTATION | 1993 / Vol. 5 / No. 1

DOI:

10.1162/neco.1993.5.1.140

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The present paper elucidates a universal property of learning curves, which shows how the generalization error, training error, and the complexity of the underlying stochastic machine are related and how the behavior of a stochastic machine is improved as the number of training examples increases. The error is measured by the entropic loss. It is proved that the generalization error converges to H0, the entropy of the conditional distribution of the true machine, as H0 + m*/(2t), while the training error converges as H0 - m*/(2t), where t is the number of examples and m* shows the complexity of the network. When the model is faithful, implying that the true machine is in the model, m* is reduced to m, the number of modifiable parameters. This is a universal law because it holds for any regular machine irrespective of its structure under the maximum likelihood estimator. Similar relations are obtained for the Bayes and Gibbs learning algorithms. These learning curves show the relation among the accuracy of learning, the complexity of a model, and the number of training examples.
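The asymptotic law stated in the abstract can be tabulated directly. The following is a minimal sketch, not code from the paper: the function name `asymptotic_learning_curves` and the sample values of H0, m* and t are illustrative assumptions, and only the leading-order terms H0 + m*/(2t) and H0 - m*/(2t) are evaluated.

```python
def asymptotic_learning_curves(h0, m_star, t):
    """Leading-order learning curves from the abstract (hypothetical helper).

    h0     : entropy of the true conditional distribution (H0)
    m_star : complexity term (m*); equals the parameter count m for a faithful model
    t      : number of training examples
    Returns (generalization_error, training_error).
    """
    if t <= 0:
        raise ValueError("t must be positive")
    correction = m_star / (2.0 * t)
    return h0 + correction, h0 - correction


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for t in (10, 100, 1000):
        gen, train = asymptotic_learning_curves(h0=0.5, m_star=20, t=t)
        print(f"t={t}: generalization={gen:.4f}, training={train:.4f}, gap={gen - train:.4f}")
```

Under these two expressions the gap between generalization and training error is m*/t, so it shrinks in proportion to the number of examples and grows with the complexity term.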

Pages: 140-153

Page count: 14
