A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs

被引:0
作者
Eric S. Tellez
Guillermo Ruiz
Edgar Chavez
Mario Graff
机构
[1] CONACyT-INFOTEC Centro de Investigación e Innovación en Tecnologías de la Información y Comunicación,
[2] CONACyT-CentroGEO Centro de Investigación en Geografía y Geomática “Ing. Jorge L. Tamayo”,undefined
[3] A.C.,undefined
[4] Centro de Investigación Científica y de Educación Superior de Ensenada,undefined
来源
Pattern Analysis and Applications | 2021年 / 24卷
关键词
Nearest neighbor search; Metric search with heuristics; Combinatorial optimization;
D O I
暂无
中图分类号
学科分类号
摘要
Nearest neighbor search is a powerful abstraction for data access; however, data indexing is troublesome even for approximate indexes. For intrinsically high-dimensional data, high-quality fast searches demand either indexes with impractically large memory usage or preprocessing time. In this paper, we introduce an algorithm to solve a nearest-neighbor query q by minimizing a kernel function defined by the distance from q to each object in the database. The minimization is performed using metaheuristics to solve the problem rapidly; even when some methods in the literature use this strategy behind the scenes, our approach is the first one using it explicitly. We also provide two approaches to select edges in the graph’s construction stage that limit memory footprint and reduce the number of free parameters simultaneously. We carry out a thorough experimental comparison with state-of-the-art indexes through synthetic and real-world datasets; we found out that our contributions achieve competitive performances regarding speed, accuracy, and memory in almost any of our benchmarks.
引用
收藏
页码:763 / 777
页数:14
相关论文
共 62 条
[1]  
Amato G(2015)A comparison of pivot selection techniques for permutation-based indexing Inf Syst 52 176-188
[2]  
Esuli A(2014)Mi-file: using inverted files for scalable approximate similarity search Multimed Tools Appl 71 1333-1362
[3]  
Falchi F(2008)Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions Commun ACM 51 117-122
[4]  
Amato G(2015)Near neighbor searching with k nearest references Inf Syst 51 43-61
[5]  
Gennaro C(2001)Searching in metric spaces ACM Comput Surv 33 273-321
[6]  
Savino P(2012)Use of permutation prefixes for efficient and scalable approximate similarity search Inf Process Manag 48 889-902
[7]  
Andoni A(2014)Optimized product quantization IEEE Trans Pattern Anal Mach Intell 36 744-755
[8]  
Indyk P(2013)Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval IEEE Trans Pattern Anal Mach Intell 35 2916-2929
[9]  
Chávez E(2015)Spherical hashing: binary code embedding with hyperspheres IEEE Trans Pattern Anal Mach Intell 37 2304-2316
[10]  
Graff M(2015)Rank-based similarity search: reducing the dimensional dependence IEEE Trans Pattern Anal Mach Intell 37 136-150