Data-Dependent Hashing via Nonlinear Spectral Gaps

被引:15
作者
Andoni, Alexandr [1 ]
Naor, Assaf [2 ]
Nikolov, Aleksandar [3 ]
Razenshteyn, Ilya [4 ]
Waingarten, Erik [1 ]
机构
[1] Columbia Univ, New York, NY 10027 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Univ Toronto, Toronto, ON, Canada
[4] Microsoft Res Redmond, Redmond, WA USA
来源
STOC'18: PROCEEDINGS OF THE 50TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING | 2018年
基金
加拿大自然科学与工程研究理事会;
关键词
Nearest neighbor search; nonlinear spectral gaps; randomized space partitions; locality-sensitive hashing; NEAREST-NEIGHBOR; APPROXIMATE; EXPANDERS;
D O I
10.1145/3188745.3188846
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We establish a generic reduction from nonlinear spectral gaps of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results: For general d-dimensional normed spaces and n-point datasets, we obtain a cell-probe ANN data structure with approximation O(log d/epsilon(2)) d(O(1))n1 epsilon, and d(O(1)) n(epsilon) cell probes per query, for any epsilon > 0. No non-trivial approximation was known before in this generality other than the O(root d) bound which follows from embedding a general norm into l(2). For and Schatten-p norms, we improve the data structure further, to obtain approximation 0(p) and sublinear query time. For l(p), this improves upon the previous best approximation 2(O(P)) (which required polynomial as opposed to near-linear in n space). For the Schatten-p norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a "tractable" space, such as l(1). Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.
引用
收藏
页码:787 / 800
页数:14
相关论文
共 49 条
[1]  
Andoni A., 2014, P 25 ANN ACM SIAM S, P1018, DOI DOI 10.1137/1.9781611973402.76
[2]  
Andoni A, 2006, ANN IEEE SYMP FOUND, P459
[3]   Approximate Near Neighbors for General Symmetric Norms [J].
Andoni, Alexandr ;
Nguyen, Huy L. ;
Nikolov, Aleksandar ;
Razenshteyn, Ilya ;
Waingarten, Erik .
STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2017, :902-913
[4]  
Andoni A, 2017, PROCEEDINGS OF THE TWENTY-EIGHTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P47
[5]   Sketching and Embedding are Equivalent for Norms [J].
Andoni, Alexandr ;
Krauthgamer, Robert ;
Razenshteyn, Ilya .
STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2015, :479-488
[6]  
Andoni A, 2009, PROCEEDINGS OF THE TWENTIETH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P865
[7]  
Andoni Alexandr, 2016, P 32 INT S COMPUTATI, DOI [10.4230/LIPIcs.SoCG.2016.9, DOI 10.4230/LIPICS.SOCG.2016.9]
[8]  
Andoni Alexandr, 2010, COMMUNICATION
[9]  
Andoni Alexandr, 2018, P ICM 2018
[10]  
Andoni Alexandr, 2009, THESIS