A Directed Isoperimetric Inequality with application to Bregman Near Neighbor Lower Bounds

被引:8
作者
Abdullah, Amirali [1 ]
Venkatasubramanian, Suresh [1 ]
机构
[1] Univ Utah, Salt Lake City, UT 84112 USA
来源
STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING | 2015年
基金
美国国家科学基金会;
关键词
Bregman divergences; isoperimetric inequalities; near neighbors; lower bounds; Fourier analysis; BOOLEAN FUNCTIONS; SENSITIVITY;
D O I
10.1145/2746539.2746595
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Bregman divergences are important distance measures that are used in applications such as computer vision, text mining, and speech processing, and are a focus of interest in machine learning due to their information-theoretic properties. There has been extensive study of algorithms for clustering and near neighbor search with respect to these divergences. In all cases, the guarantees depend not just on the data size n and dimensionality d, but also on a structure constant mu >= 1 that depends solely on a generating convex function phi and can grow without bound independently. In general, this mu parametrizes the degree to which a given divergence is "asymmetric". In this paper, we provide the first evidence that this dependence on mu might be intrinsic. We focus on the problem of approximate nearest neighbor (ANN) search for Bregman divergences. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for c-approximate near-neighbor search that admits r probes must use space Omega(dn(1+mu/cr)) In contrast for LSH under l(1) the best bound is Omega(dn(1+1/cr)). Our results interpolate between known lower bounds both for LSH-based ANN under l(1) as well as the generally harder partial match (Partial Match) problem (in non-adaptive settings). The bounds match the former when mu is small and the latter when mu is Omega(d/log n). This further strengthens the intuition that Partial Match corresponds to an "asymmetric" version of ANN, as well as opening up the possibility of a new line of attack for lower bounds on Partial Match. Our new tool is a directed variant of the standard boolean noise operator. We prove a generalization of the BonamiBeckner hypercontractivity inequality (restricted to certain subsets of the Hamming cube), and use this to prove the desired directed isoperimetric inequality that we use in our data structure lower bound.
引用
收藏
页码:509 / 517
页数:9
相关论文
共 34 条
[1]  
ABDULLAH A., 2012, P 28 ANN S COMP GEOM, P31
[2]   Clustering for Metric and Nonmetric Distance Measures [J].
Ackermann, Marcel R. ;
Bloemer, Johannes ;
Sohler, Christian .
ACM TRANSACTIONS ON ALGORITHMS, 2010, 6 (04)
[3]  
Ackermann MR, 2010, LECT NOTES COMPUT SC, V6139, P212, DOI 10.1007/978-3-642-13731-0_21
[4]  
Ackermann MR, 2009, PROCEEDINGS OF THE TWENTIETH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P1088
[5]  
Ahlberg D., 2011, ARXIV11080310V2
[6]  
Amari S.I., 2000, Methods of information geometry, V191
[7]  
Banerjee A, 2005, J MACH LEARN RES, V6, P1705
[8]  
Benjamini I, 1999, PUBL MATH, P5
[9]   Bregman Voronoi Diagrams [J].
Boissonnat, Jean-Daniel ;
Nielsen, Frank ;
Nock, Richard .
DISCRETE & COMPUTATIONAL GEOMETRY, 2010, 44 (02) :281-307
[10]  
Borodin A., 1999, Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, P312, DOI 10.1145/301250.301330