LEARNING A DECISION BOUNDARY FROM STOCHASTIC EXAMPLES - INCREMENTAL ALGORITHMS WITH AND WITHOUT QUERIES

Cited: 4
Authors
KABASHIMA, Y [1]
SHINOMOTO, S [1]
Affiliations
[1] KYOTO UNIV, DEPT PHYS, KYOTO 606, JAPAN
DOI: 10.1162/neco.1995.7.1.158
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Even if it is not possible to reproduce a target input-output relation exactly, a learning machine should still be able to minimize the probability of making errors. A practical learning algorithm should also be simple enough that it need not memorize the example data. Incremental algorithms such as error backpropagation satisfy this requirement. We propose incremental algorithms that provide fast convergence of the machine parameter $\theta$ to its optimal choice $\theta_o$ with respect to the number of examples $t$. We consider a binary choice model whose target relation has a blurred boundary, and a machine whose parameter $\theta$ specifies a decision boundary for the output prediction. The question we address is how fast $\theta$ can approach $\theta_o$, depending on whether the machine can specify inputs as queries to the target relation during the learning stage, or the inputs are drawn from a fixed distribution. If queries are permitted, the machine can achieve the fastest convergence, $(\theta - \theta_o)^2 \sim O(t^{-1})$. If not, $O(t^{-1})$ convergence is generally not attainable. For learning without queries, we showed in a previous paper that the error-minimum algorithm exhibits slow convergence, $(\theta - \theta_o)^2 \sim O(t^{-2/3})$. Here we propose a practical algorithm that provides considerably faster convergence, $O(t^{-4/5})$. The convergence can be accelerated further by more elaborate algorithms; the fastest convergence turns out to be $O((\ln t)^2\, t^{-1})$. This scaling is considered optimal among possible algorithms and is not a consequence of the incremental nature of our algorithm.
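The abstract only states the convergence rates, so the sketch below is a hedged illustration of the query setting, not the authors' algorithm. The teacher model (a logistic blur of width WIDTH around a hypothetical true boundary THETA_O) and the function learn_with_queries are my own assumptions: querying at the current estimate, where the teacher answers 1 with probability exactly 1/2 at $\theta_o$, turns learning into Robbins-Monro root-finding on $E[y] - 1/2$, whose decaying-gain update attains the $(\theta - \theta_o)^2 \sim O(t^{-1})$ rate quoted for learning with queries.

    import numpy as np

    rng = np.random.default_rng(0)

    THETA_O = 0.3   # hypothetical true boundary, for the demo only
    WIDTH = 0.5     # blur width of the stochastic target relation

    def teacher(x):
        # Stochastic binary teacher: P(y=1 | x) crosses 1/2 exactly at x = THETA_O.
        p = 1.0 / (1.0 + np.exp(-(x - THETA_O) / WIDTH))
        return float(rng.random() < p)

    def learn_with_queries(n_examples, gain=2.0):
        # Query the teacher at the current boundary estimate and apply a
        # Robbins-Monro update with a 1/t gain; since E[y - 1/2] vanishes
        # only at theta_o, theta converges there at the O(1/t) MSE rate.
        theta = 0.0
        for t in range(1, n_examples + 1):
            y = teacher(theta)
            theta -= (gain / t) * (y - 0.5)
        return theta

    if __name__ == "__main__":
        for n in (10**2, 10**3, 10**4):
            sq_errs = [(learn_with_queries(n) - THETA_O) ** 2 for _ in range(50)]
            print(f"t = {n:>5}   mean (theta - theta_o)^2 = {np.mean(sq_errs):.2e}")

Without queries the inputs arrive from a fixed distribution rather than being placed at the current estimate, so this root-finding trick is unavailable; that restriction is why the abstract reports the slower $O(t^{-2/3})$ and $O(t^{-4/5})$ rates for that setting.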
Pages: 158-172
Page count: 15