HARDNESS OF LEARNING HALFSPACES WITH NOISE

Cited by: 45
Authors
Guruswami, Venkatesan [1]
Raghavendra, Prasad [1]
Affiliation
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
hardness of approximation; perceptron; computational learning; algorithm
DOI
10.1137/070685798
Chinese Library Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Learning an unknown halfspace (also called a perceptron) from labeled examples is one of the classic problems in machine learning. In the noise-free case, when a halfspace consistent with all the training examples exists, the problem can be solved in polynomial time using linear programming. However, under the promise that a halfspace consistent with a fraction (1 - epsilon) of the examples exists (for some small constant epsilon > 0), it was not known how to efficiently find a halfspace that is correct on even 51% of the examples. Nor was any hardness result known that ruled out getting agreement on more than 99.9% of the examples. In this work, we close this gap in our understanding and prove that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense. Specifically, for arbitrary epsilon, delta > 0, we prove that, given a set of example-label pairs from the hypercube, a fraction (1 - epsilon) of which can be explained by a halfspace, it is NP-hard to find a halfspace that correctly labels a fraction (1/2 + delta) of the examples. The hardness result is tight, since it is trivial to get agreement on half of the examples. In learning-theory parlance, we prove that weak proper agnostic learning of halfspaces is hard. This settles a question raised by Blum et al. in their work on learning halfspaces in the presence of random classification noise [Algorithmica, 22 (1998), pp. 35-52], and raised by the authors of several more recent works as well. Along the way, we also obtain a strong hardness result for another basic computational problem: solving a linear system over the rationals.
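The noise-free case reduces directly to linear-programming feasibility: a halfspace consistent with every example exists exactly when the constraints y_i (w . x_i + b) >= 1 are jointly satisfiable, where the right-hand side 1 is the standard rescaling of a strict inequality over a finite sample. Below is a minimal sketch of that reduction, assuming NumPy and SciPy are available; the function name consistent_halfspace and the array conventions are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def consistent_halfspace(X, y):
    """Find (w, b) with y_i * (w . x_i + b) >= 1 for every example,
    or return None if no strictly consistent halfspace exists.
    X: (n, d) array of examples; y: (n,) array of +1/-1 labels."""
    n, d = X.shape
    # Variables z = (w_1, ..., w_d, b).
    # Rewrite y_i * ([x_i, 1] . z) >= 1 as -y_i * ([x_i, 1] . z) <= -1.
    A_ub = -y[:, None] * np.hstack([X.astype(float), np.ones((n, 1))])
    res = linprog(c=np.zeros(d + 1),          # zero objective: pure feasibility
                  A_ub=A_ub, b_ub=-np.ones(n),
                  bounds=[(None, None)] * (d + 1))  # w, b are free variables
    if not res.success:
        return None  # infeasible: no halfspace explains all the examples
    return res.x[:d], res.x[d]

# XOR-labeled points of the Boolean square are not linearly separable,
# so this prints None; flipping the label of (1, 1) makes it feasible.
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([-1, -1, 1, 1])
print(consistent_halfspace(X, y))
```

With even a small fraction of adversarially mislabeled examples this LP becomes infeasible, and the paper's result says no efficient procedure can recover a halfspace that beats the trivial 1/2 agreement.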
Pages: 742-765
Number of pages: 24
References
28 items in total
[1] Agmon, S. The relaxation method for linear inequalities. Canadian Journal of Mathematics, 1954, 6(3): 382-392.
[2] Alon, N.; Matias, Y.; Szegedy, M. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 1999, 58(1): 137-147.
[3] Alon, N.; Babai, L.; Itai, A. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 1986, 7(4): 567-583.
[4] Alon, N.; Spencer, J. H. The Probabilistic Method. Wiley, 1992.
[5] Amaldi, E.; Kann, V. On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 1998, 209(1-2): 237-260.
[6] Anderson, I. Combinatorics of Finite Sets. Dover, 2002.
[7] Arora, S.; Babai, L.; Stern, J.; Sweedyk, Z. The hardness of approximate optima in lattices, codes, and systems of linear equations. Journal of Computer and System Sciences, 1997, 54(2): 317-331.
[8] Arora, S.; Lund, C.; Motwani, R.; Sudan, M.; Szegedy, M. Proof verification and the hardness of approximation problems. Journal of the ACM, 1998, 45(3): 501-555.
[9] Ben-David, S.; Eiron, N.; Long, P. M. On the difficulty of approximately maximizing agreements. Journal of Computer and System Sciences, 2003, 66(3): 496-514.
[10] Blum, A.; Frieze, A.; Kannan, R.; Vempala, S. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica, 1998, 22(1-2): 35-52.