General bounds on the number of examples needed for learning probabilistic concepts

Cited by: 23
Author
Simon, HU
Affiliation
[1] Universität Dortmund, Fachbereich Informatik
Keywords
DOI
10.1006/jcss.1996.0019
Chinese Library Classification (CLC)
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
Given a p-concept class C, we define two important functions d_C(gamma) and d'_C(gamma) (related to the notion of gamma-shattering). We prove a lower bound of Omega((d_C(gamma) - 1)/(epsilon gamma^2)) on the number of examples required for learning C with an (epsilon, gamma)-good model of probability. We prove similar lower bounds for some other learning models, such as learning with gamma-bounded absolute (or quadratic) difference or learning with a gamma-good decision rule. For the class ND of nondecreasing p-concepts on the real domain, d_ND(gamma) = Omega(1/gamma). It can be shown that the resulting lower bounds for learning ND (within the models in consideration) are tight to within a logarithmic factor. In order to obtain the almost-matching upper bounds, we introduce a new method for designing learning algorithms: dynamic partitioning of the domain by means of splitting trees. The paper also contains a discussion of the gaps between the general lower bounds and the corresponding general upper bounds. It can be shown that, under very mild conditions, these gaps are quite narrow. (C) 1996 Academic Press, Inc.
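As a quick, hedged illustration (a direct substitution that the abstract does not spell out; the restriction gamma <= 1/2 is an added assumption so that 1/gamma - 1 = Theta(1/gamma)), plugging d_ND(gamma) = Omega(1/gamma) into the general lower bound gives the sample size needed for learning ND:

\[
m \;=\; \Omega\!\left(\frac{d_{\mathrm{ND}}(\gamma)-1}{\varepsilon\,\gamma^{2}}\right),
\qquad
d_{\mathrm{ND}}(\gamma)=\Omega\!\left(\frac{1}{\gamma}\right)
\;\Longrightarrow\;
m \;=\; \Omega\!\left(\frac{1}{\varepsilon\,\gamma^{3}}\right)
\quad\text{(for }\gamma\le 1/2\text{)}.
\]

Per the abstract, the splitting-tree upper bounds for ND match this lower bound to within a logarithmic factor.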
Pages: 239-254
Number of pages: 16