NEURAL NET ALGORITHMS THAT LEARN IN POLYNOMIAL-TIME FROM EXAMPLES AND QUERIES

Cited by: 82
Authors
BAUM, EB
Affiliation
[1] NEC Research Institute, Princeton, NJ 08540
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1991 / Vol. 2 / No. 1
DOI
10.1109/72.80287
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Backpropagation and many other neural net algorithms use a database of labeled examples to train neural networks. Here examples are pairs (x, t(x)), where x is an input vector and t(x) is the value of the target function for input x. We propose an algorithm which trains networks using examples and queries. In a query [1] the algorithm supplies a y and is told t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g. by appeal to a human expert. Our algorithm is proved to PAC learn in polynomial time the class of target functions defined by layered, depth-2 threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k ≤ 4. While target functions and input distributions can be described for which the algorithm will fail for larger k, it appears likely to work well "in practice." Tests of a variant of the algorithm have consistently and rapidly learned random nets of this type with n = 200 and k = 200 for examples chosen from a uniform distribution. Alternatively, the algorithm can learn the parity function in n = 200 dimensions to an accuracy of 99.8% in 30 min of SGI workstation CPU time. The algorithm is quite efficient, training a net no larger than the target net, using only O(nk) queries and approximately O(kε⁻¹) examples, and using time, arguably near optimal, expected to be O(kn³ + nk²ε⁻¹ + k²ε⁻³). For comparison, note that an algorithm training a net of this size using only examples would need Ω(knε⁻¹) examples, and a single forward pass over those examples would take time Ω(k²n²ε⁻¹). The algorithm can also be proved to learn intersections of k half-spaces in Rⁿ in time polynomial in both n and k. A variant of the algorithm can learn arbitrary-depth layered threshold networks with n inputs and k units in the first hidden layer in time polynomial in the larger of n and k but exponential in the smaller of the two.
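The query model the abstract refers to is easy to state in code. The following is a minimal sketch in Python with NumPy, not the paper's full algorithm: it builds a random depth-2 threshold net as the oracle's target and demonstrates one query-driven primitive such an algorithm can build on, bisecting the segment between an oppositely labeled pair of inputs to pin down a point on a hidden unit's hyperplane with O(log(1/tol)) queries. The sizes n = 20, k = 4, the tolerance, and the helper name boundary_point are illustrative assumptions, not values or identifiers from the paper.

# Sketch of the example-and-query setting (illustrative, not the paper's
# algorithm). A random depth-2 threshold net plays the target t; the
# oracle answers membership queries t(y); bisection locates a point on
# a decision boundary between two oppositely labeled inputs.
import numpy as np

rng = np.random.default_rng(0)

n, k = 20, 4                   # inputs, hidden threshold units (assumed sizes)
W = rng.normal(size=(k, n))    # hidden-unit weight vectors
b = rng.normal(size=k)         # hidden-unit thresholds
v = rng.normal(size=k)         # output-unit weights

def oracle(y):
    """Answer a query: the target net's Boolean label t(y)."""
    hidden = (W @ y + b > 0).astype(float)  # k threshold units
    return float(v @ hidden > 0)            # single threshold output unit

def boundary_point(x_pos, x_neg, tol=1e-6):
    """Bisect the segment between a positive and a negative example
    until a boundary point is bracketed to within tol; each step
    spends exactly one query."""
    lo, hi = x_neg, x_pos                   # invariant: t(lo)=0, t(hi)=1
    while np.linalg.norm(hi - lo) > tol:
        mid = (lo + hi) / 2
        if oracle(mid) == 1.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Draw random examples until one of each label is seen, then query.
x_pos = x_neg = None
while x_pos is None or x_neg is None:
    x = rng.normal(size=n)
    if oracle(x) == 1.0:
        x_pos = x
    else:
        x_neg = x

p = boundary_point(x_pos, x_neg)
d = (x_pos - x_neg) / np.linalg.norm(x_pos - x_neg)
print("labels just either side of the boundary point:",
      oracle(p + 1e-4 * d), oracle(p - 1e-4 * d))

From boundary points like p, an algorithm in this setting can estimate the hyperplanes of individual hidden units with further local queries, which is the kind of step that lets the query count stay as low as the O(nk) bound quoted above.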
Pages: 5-19
Page count: 15
References (38 in total)
[1] Ackley D. H., 1985, Cognitive Science, V9, P147
[2] Angluin D., 1988, Machine Learning, V2, P319, DOI 10.1023/A:1022821128753
[3] Angluin D., Valiant L. G., 1979, Fast probabilistic algorithms for Hamiltonian circuits and matchings, Journal of Computer and System Sciences, V18, P155-193
[4] Baum E. B., 1990, Journal of Complexity, V6, P67, DOI 10.1016/0885-064X(90)90012-3
[5] Baum E. B., 1990, Proceedings of the 1990 EURASIP Workshop, P2
[6] Baum E. B., Neural Computation (in press)
[7] Baum E. B., Haussler D., 1989, What size net gives valid generalization?, Neural Computation, V1, P151-160
[8] Baum E. B., 1990, The perceptron algorithm is fast for nonmalicious distributions, Neural Computation, V2, P248-260
[9] Becker S., 1988, Proceedings of the 1988 Connectionist Models Summer School
[10] Blum A., 1989, Advances in Neural Information Processing Systems, P494