Time-Space Lower Bounds for Two-Pass Learning

Cited by: 12
Authors
Garg, Sumegha [1 ]
Raz, Ran [1 ]
Tal, Avishay [1 ]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Source
34th Computational Complexity Conference (CCC 2019) | 2019 / Vol. 137
Funding
National Science Foundation (USA);
Keywords
branching program; time-space tradeoffs; two-pass streaming; PAC learning; lower bounds;
DOI
10.4230/LIPIcs.CCC.2019.22
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [11, 7, 12, 9, 2, 5]. For example, any algorithm for learning parities of size n requires either a memory of size Ω(n^2) or an exponential number of samples [11]. All these works modeled the learner as a one-pass branching program, allowing only one pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{1.5}) or at least 2^{Ω(√n)} samples. More generally, a matrix M : A × X → {-1, 1} corresponds to the following learning problem: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), ..., where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{-k} · |A| rows and at least 2^{-ℓ} · |X| columns has bias of at most 2^{-r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · min{k, √ℓ}), or at least 2^{Ω(min{k, √ℓ, r})} samples.
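The streaming model in the abstract, specialized to parity learning, can be sketched as follows. This is a minimal illustration, not code from the paper: the helper names (`parity_sample_stream`, `learn_parity_unbounded_memory`) are invented here, and the solver deliberately uses unbounded memory (it stores ~n equations of n bits each, i.e., Θ(n^2) bits), which is exactly the regime not excluded by the Ω(n^2) one-pass lower bound of [11].

```python
import random


def parity_sample_stream(x, num_samples, rng=random):
    """Yield samples (a_i, b_i) for parity learning: a_i in {0,1}^n is
    uniform and b_i = M(a_i, x) = (-1)^{<a_i, x> mod 2}.
    (Illustrative helper, not from the paper.)"""
    n = len(x)
    for _ in range(num_samples):
        a = [rng.randrange(2) for _ in range(n)]
        inner = sum(ai * xi for ai, xi in zip(a, x)) % 2
        yield a, 1 - 2 * inner  # parity 0 -> +1, parity 1 -> -1


def learn_parity_unbounded_memory(samples, n):
    """Recover x by online Gaussian elimination over GF(2).
    Stores up to n pivot rows of n bits each: Theta(n^2) memory."""
    pivots = {}  # pivot column -> (row bits packed as int, parity bit)
    for a, b in samples:
        v = 0
        for bit in a:
            v = (v << 1) | bit  # a[0] is the most significant bit
        p = 0 if b == 1 else 1
        for col in range(n):
            mask = 1 << (n - 1 - col)
            if v & mask:
                if col in pivots:
                    pv, pp = pivots[col]  # reduce by existing pivot
                    v ^= pv
                    p ^= pp
                else:
                    pivots[col] = (v, p)  # new pivot row
                    break
    x = [0] * n
    for col in sorted(pivots, reverse=True):  # back-substitution
        v, p = pivots[col]
        for c2 in range(col + 1, n):
            if v & (1 << (n - 1 - c2)):
                p ^= x[c2]
        x[col] = p
    return x
```

With roughly n random samples the system is full rank with high probability, so the unknown x is recovered; the point of the paper's result is that a two-pass learner with memory well below n^{1.5} cannot do anything comparable without seeing 2^{Ω(√n)} samples.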
Pages: 39
References
16 entries in total
[2] Beame, Paul. Conference on Learning Theory (COLT), 2018, p. 843.
[3] Chor, B.; Goldreich, O. "Unbiased Bits from Sources of Weak Randomness and Probabilistic Communication Complexity." SIAM Journal on Computing, 1988, 17(2): 230-261.
[4] Dagan, Y. Conference on Learning Theory (COLT), 2018, Vol. 75, p. 1145.
[5] Garg, Sumegha; Raz, Ran; Tal, Avishay. "Extractor-Based Time-Space Lower Bounds for Learning." STOC'18: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018, pp. 990-1002.
[6] Kol, Gillat; Raz, Ran; Tal, Avishay. "Time-Space Hardness of Learning Sparse Parities." STOC'17: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, 2017, pp. 1067-1080.
[7] Kol, G. STOC'13: Proceedings of the 2013 ACM Symposium on Theory of Computing, 2013, p. 715.
[8] Moshkovitz, Dana. Conference on Learning Theory (COLT), 2017, p. 1516.
[9] Moshkovitz, Dana. 9th Innovations in Theoretical Computer Science Conference (ITCS), 2018.
[10] Moshkovitz, Michal. arXiv:1703.00729, 2017.