HARDNESS OF LEARNING HALFSPACES WITH NOISE

Cited by: 45
Authors
Guruswami, Venkatesan [1]
Raghavendra, Prasad [1]
Affiliation
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
hardness of approximation; perceptron; computational learning; algorithm
DOI
10.1137/070685798
Chinese Library Classification
TP301 [Theory, Methods]
Discipline Code
081202
Abstract
Learning an unknown halfspace (also called a perceptron) from labeled examples is one of the classic problems in machine learning. In the noise-free case, when a halfspace consistent with all the training examples exists, the problem can be solved in polynomial time using linear programming. However, under the promise that a halfspace consistent with a fraction (1 - epsilon) of the examples exists (for some small constant epsilon > 0), it was not known how to efficiently find a halfspace that is correct on even 51% of the examples. Nor was any hardness result known that ruled out getting agreement on more than 99.9% of the examples. In this work, we close this gap in our understanding and prove that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense. Specifically, for arbitrary epsilon, delta > 0, we prove that, given a set of example-label pairs from the hypercube, a fraction (1 - epsilon) of which can be explained by a halfspace, it is NP-hard to find a halfspace that correctly labels a fraction (1/2 + delta) of the examples. The hardness result is tight, since it is trivial to get agreement on half of the examples. In learning-theory parlance, we prove that weak proper agnostic learning of halfspaces is hard. This settles a question raised by Blum et al. in their work on learning halfspaces in the presence of random classification noise [Algorithmica, 22 (1998), pp. 35-52], and raised by the authors of several more recent works as well. Along the way, we also obtain a strong hardness result for another basic computational problem: solving a linear system over the rationals.
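The noise-free case reduces directly to linear-programming feasibility: a halfspace consistent with every example exists exactly when the constraints y_i (w . x_i + b) >= 1 are jointly satisfiable, where the right-hand side 1 is the standard rescaling of a strict inequality over a finite sample. Below is a minimal sketch of that reduction, assuming NumPy and SciPy are available; the function name consistent_halfspace and the array conventions are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def consistent_halfspace(X, y):
    """Find (w, b) with y_i * (w . x_i + b) >= 1 for every example,
    or return None if no strictly consistent halfspace exists.
    X: (n, d) array of examples; y: (n,) array of +1/-1 labels."""
    n, d = X.shape
    # Variables z = (w_1, ..., w_d, b).
    # Rewrite y_i * ([x_i, 1] . z) >= 1 as -y_i * ([x_i, 1] . z) <= -1.
    A_ub = -y[:, None] * np.hstack([X.astype(float), np.ones((n, 1))])
    res = linprog(c=np.zeros(d + 1),          # zero objective: pure feasibility
                  A_ub=A_ub, b_ub=-np.ones(n),
                  bounds=[(None, None)] * (d + 1))  # w, b are free variables
    if not res.success:
        return None  # infeasible: no halfspace explains all the examples
    return res.x[:d], res.x[d]

# XOR-labeled points of the Boolean square are not linearly separable,
# so this prints None; flipping the label of (1, 1) makes it feasible.
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([-1, -1, 1, 1])
print(consistent_halfspace(X, y))
```

With even a small fraction of adversarially mislabeled examples this LP becomes infeasible, and the paper's result says no efficient procedure can recover a halfspace that beats the trivial 1/2 agreement.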
Pages: 742-765
Number of pages: 24
References
28 items in total
[1] Agmon, S. The relaxation method for linear inequalities. Canadian Journal of Mathematics, 1954, 6(3): 382-392.
[2] Alon, N.; Matias, Y.; Szegedy, M. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 1999, 58(1): 137-147.
[3] Alon, N.; Babai, L.; Itai, A. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 1986, 7(4): 567-583.
[4] Alon, N.; Spencer, J. H. The Probabilistic Method. Wiley, 1992.
[5] Amaldi, E.; Kann, V. On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 1998, 209(1-2): 237-260.
[6] Anderson, I. Combinatorics of Finite Sets. Dover, 2002.
[7] Arora, S.; Babai, L.; Stern, J.; Sweedyk, Z. The hardness of approximate optima in lattices, codes, and systems of linear equations. Journal of Computer and System Sciences, 1997, 54(2): 317-331.
[8] Arora, S.; Lund, C.; Motwani, R.; Sudan, M.; Szegedy, M. Proof verification and the hardness of approximation problems. Journal of the ACM, 1998, 45(3): 501-555.
[9] Ben-David, S.; Eiron, N.; Long, P. M. On the difficulty of approximately maximizing agreements. Journal of Computer and System Sciences, 2003, 66(3): 496-514.
[10] Blum, A.; Frieze, A.; Kannan, R.; Vempala, S. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica, 1998, 22(1-2): 35-52.