NEURAL NET ALGORITHMS THAT LEARN IN POLYNOMIAL-TIME FROM EXAMPLES AND QUERIES

Cited by: 82
Authors
BAUM, EB
Affiliation
[1] NEC Research Institute, Princeton, NJ 08540
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1991 / Vol. 2 / No. 1
DOI
10.1109/72.80287
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Backpropagation and many other neural net algorithms use a database of labeled examples to train neural networks. Here examples are pairs (x, t(x)), where x is an input vector and t(x) is the value of the target function for input x. We propose an algorithm which trains networks using examples and queries. In a query [1] the algorithm supplies a y and is told t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g. by appeal to a human expert. Our algorithm is proved to PAC learn in polynomial time the class of target functions defined by layered, depth-2 threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k ≤ 4. While target functions and input distributions can be described for which the algorithm will fail for larger k, it appears likely to work well "in practice." Tests of a variant of the algorithm have consistently and rapidly learned random nets of this type with n = 200 and k = 200 for examples chosen from a uniform distribution. Alternatively, the algorithm can learn the parity function in n = 200 dimensions to an accuracy of 99.8% in 30 min of SGI workstation CPU time. The algorithm is quite efficient, training a net no larger than the target net, using only O(nk) queries and approximately O(kε⁻¹) examples, and using time, arguably near optimal, expected to be O(kn³ + nk²ε⁻¹ + k²ε⁻³). For comparison, note that an algorithm training a net of this size using only examples would need Ω(knε⁻¹) examples, and a single forward pass over those examples would take time Ω(k²n²ε⁻¹). The algorithm can also be proved to learn intersections of k half-spaces in Rⁿ in time polynomial in both n and k. A variant of the algorithm can learn arbitrary-depth layered threshold networks with n inputs and k units in the first hidden layer in time polynomial in the larger of n and k but exponential in the smaller of the two.
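The query model the abstract refers to is easy to state in code. The following is a minimal sketch in Python with NumPy, not the paper's full algorithm: it builds a random depth-2 threshold net as the oracle's target and demonstrates one query-driven primitive such an algorithm can build on, bisecting the segment between an oppositely labeled pair of inputs to pin down a point on a hidden unit's hyperplane with O(log(1/tol)) queries. The sizes n = 20, k = 4, the tolerance, and the helper name boundary_point are illustrative assumptions, not values or identifiers from the paper.

# Sketch of the example-and-query setting (illustrative, not the paper's
# algorithm). A random depth-2 threshold net plays the target t; the
# oracle answers membership queries t(y); bisection locates a point on
# a decision boundary between two oppositely labeled inputs.
import numpy as np

rng = np.random.default_rng(0)

n, k = 20, 4                   # inputs, hidden threshold units (assumed sizes)
W = rng.normal(size=(k, n))    # hidden-unit weight vectors
b = rng.normal(size=k)         # hidden-unit thresholds
v = rng.normal(size=k)         # output-unit weights

def oracle(y):
    """Answer a query: the target net's Boolean label t(y)."""
    hidden = (W @ y + b > 0).astype(float)  # k threshold units
    return float(v @ hidden > 0)            # single threshold output unit

def boundary_point(x_pos, x_neg, tol=1e-6):
    """Bisect the segment between a positive and a negative example
    until a boundary point is bracketed to within tol; each step
    spends exactly one query."""
    lo, hi = x_neg, x_pos                   # invariant: t(lo)=0, t(hi)=1
    while np.linalg.norm(hi - lo) > tol:
        mid = (lo + hi) / 2
        if oracle(mid) == 1.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Draw random examples until one of each label is seen, then query.
x_pos = x_neg = None
while x_pos is None or x_neg is None:
    x = rng.normal(size=n)
    if oracle(x) == 1.0:
        x_pos = x
    else:
        x_neg = x

p = boundary_point(x_pos, x_neg)
d = (x_pos - x_neg) / np.linalg.norm(x_pos - x_neg)
print("labels just either side of the boundary point:",
      oracle(p + 1e-4 * d), oracle(p - 1e-4 * d))

From boundary points like p, an algorithm in this setting can estimate the hyperplanes of individual hidden units with further local queries, which is the kind of step that lets the query count stay as low as the O(nk) bound quoted above.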
Pages: 5-19
Page count: 15
References (38 in total)
[1] Ackley D. H., 1985, Cognitive Science, V9, P147
[2] Angluin D., 1988, Machine Learning, V2, P319, DOI 10.1023/A:1022821128753
[3] Angluin D., Valiant L. G., 1979, Fast probabilistic algorithms for Hamiltonian circuits and matchings, Journal of Computer and System Sciences, V18, P155-193
[4] Baum E. B., 1990, Journal of Complexity, V6, P67, DOI 10.1016/0885-064X(90)90012-3
[5] Baum E. B., 1990, Proceedings of the 1990 EURASIP Workshop, P2
[6] Baum E. B., Neural Computation (in press)
[7] Baum E. B., Haussler D., 1989, What size net gives valid generalization?, Neural Computation, V1, P151-160
[8] Baum E. B., 1990, The perceptron algorithm is fast for nonmalicious distributions, Neural Computation, V2, P248-260
[9] Becker S., 1988, Proceedings of the 1988 Connectionist Models Summer School
[10] Blum A., 1989, Advances in Neural Information Processing Systems, P494