Time-Space Lower Bounds for Two-Pass Learning

Cited by: 12
Authors
Garg, Sumegha [1 ]
Raz, Ran [1 ]
Tal, Avishay [1 ]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Source
34th Computational Complexity Conference (CCC 2019) | 2019 / Vol. 137
Funding
National Science Foundation (USA);
Keywords
branching program; time-space tradeoffs; two-pass streaming; PAC learning; lower bounds;
DOI
10.4230/LIPIcs.CCC.2019.22
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [11, 7, 12, 9, 2, 5]. For example, any algorithm for learning parities of size n requires either a memory of size Ω(n^2) or an exponential number of samples [11]. All these works modeled the learner as a one-pass branching program, allowing only one pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{1.5}) or at least 2^{Ω(√n)} samples. More generally, a matrix M : A × X → {-1, 1} corresponds to the following learning problem: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), ..., where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{-k} · |A| rows and at least 2^{-ℓ} · |X| columns has bias of at most 2^{-r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · min{k, √ℓ}), or at least 2^{Ω(min{k, √ℓ, r})} samples.
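The streaming model in the abstract, specialized to parity learning, can be sketched as follows. This is a minimal illustration, not code from the paper: the helper names (`parity_sample_stream`, `learn_parity_unbounded_memory`) are invented here, and the solver deliberately uses unbounded memory (it stores ~n equations of n bits each, i.e., Θ(n^2) bits), which is exactly the regime not excluded by the Ω(n^2) one-pass lower bound of [11].

```python
import random


def parity_sample_stream(x, num_samples, rng=random):
    """Yield samples (a_i, b_i) for parity learning: a_i in {0,1}^n is
    uniform and b_i = M(a_i, x) = (-1)^{<a_i, x> mod 2}.
    (Illustrative helper, not from the paper.)"""
    n = len(x)
    for _ in range(num_samples):
        a = [rng.randrange(2) for _ in range(n)]
        inner = sum(ai * xi for ai, xi in zip(a, x)) % 2
        yield a, 1 - 2 * inner  # parity 0 -> +1, parity 1 -> -1


def learn_parity_unbounded_memory(samples, n):
    """Recover x by online Gaussian elimination over GF(2).
    Stores up to n pivot rows of n bits each: Theta(n^2) memory."""
    pivots = {}  # pivot column -> (row bits packed as int, parity bit)
    for a, b in samples:
        v = 0
        for bit in a:
            v = (v << 1) | bit  # a[0] is the most significant bit
        p = 0 if b == 1 else 1
        for col in range(n):
            mask = 1 << (n - 1 - col)
            if v & mask:
                if col in pivots:
                    pv, pp = pivots[col]  # reduce by existing pivot
                    v ^= pv
                    p ^= pp
                else:
                    pivots[col] = (v, p)  # new pivot row
                    break
    x = [0] * n
    for col in sorted(pivots, reverse=True):  # back-substitution
        v, p = pivots[col]
        for c2 in range(col + 1, n):
            if v & (1 << (n - 1 - c2)):
                p ^= x[c2]
        x[col] = p
    return x
```

With roughly n random samples the system is full rank with high probability, so the unknown x is recovered; the point of the paper's result is that a two-pass learner with memory well below n^{1.5} cannot do anything comparable without seeing 2^{Ω(√n)} samples.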
Pages: 39
References
16 entries in total
[2] Beame, Paul. Conference on Learning Theory (COLT), 2018, p. 843.
[3] Chor, B.; Goldreich, O. "Unbiased Bits from Sources of Weak Randomness and Probabilistic Communication Complexity." SIAM Journal on Computing, 1988, 17(2): 230-261.
[4] Dagan, Y. Conference on Learning Theory (COLT), 2018, Vol. 75, p. 1145.
[5] Garg, Sumegha; Raz, Ran; Tal, Avishay. "Extractor-Based Time-Space Lower Bounds for Learning." STOC'18: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018, pp. 990-1002.
[6] Kol, Gillat; Raz, Ran; Tal, Avishay. "Time-Space Hardness of Learning Sparse Parities." STOC'17: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, 2017, pp. 1067-1080.
[7] Kol, G. STOC'13: Proceedings of the 2013 ACM Symposium on Theory of Computing, 2013, p. 715.
[8] Moshkovitz, Dana. Conference on Learning Theory (COLT), 2017, p. 1516.
[9] Moshkovitz, Dana. 9th Innovations in Theoretical Computer Science Conference (ITCS), 2018.
[10] Moshkovitz, Michal. arXiv:1703.00729, 2017.