Distance Encoded Product Quantization for Approximate K-Nearest Neighbor Search in High-Dimensional Space

被引:19
作者
Heo, Jae-Pil [1 ]
Lin, Zhe [2 ]
Yoon, Sung-Eui [3 ]
机构
[1] Sungkyunkwan Univ, Suwon 16419, South Korea
[2] Adobe Res, San Jose, CA 95110 USA
[3] Korea Adv Inst Sci & Technol, Daejeon 34141, South Korea
关键词
Vector quantization; nearest neighbor search; image retrieval; compact code; high-dimensional search;
D O I
10.1109/TPAMI.2018.2853161
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques encoding high-dimensional data to compact codes have been proposed. The product quantization and its variations that encode the cluster index in each subspace have been shown to provide impressive accuracy. In this paper, we explore a simple question: is it best to use all the bit-budget for encoding a cluster index? We have found that as data points are located farther away from the cluster centers, the error of estimated distance becomes larger. To address this issue, we propose a novel compact code representation that encodes both the cluster index and quantized distance between a point and its cluster center in each subspace by distributing the bit-budget. We also propose two distance estimators tailored to our representation. We further extend our method to encode global residual distances in the original space. We have evaluated our proposed methods on benchmarks consisting of GIST, VLAD, and CNN features. Our extensive experiments show that the proposed methods significantly and consistently improve the search accuracy over other tested techniques. This result is achieved mainly because our methods accurately estimate distances.
引用
收藏
页码:2084 / 2097
页数:14
相关论文
共 35 条
[1]  
Ai L., 2015, MULTIMEDIA SYST, V23, P169
[2]  
[Anonymous], 2012, VECTOR QUANTIZATION
[3]  
Banerjee A, 2002, SIAM PROC S, P333
[4]  
Blum A., 2010, FDN DATA SCI
[5]   Transform Coding for Fast Approximate Nearest Neighbor Search in High Dimensions [J].
Brandt, Jonathan .
2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, :1815-1822
[6]  
Cai T, 2013, J MACH LEARN RES, V14, P1837
[7]  
Charikar Moses S, 2002, P 34 ANN ACM S THEOR, P380
[8]   Approximate Nearest Neighbor Search by Residual Vector Quantization [J].
Chen, Yongjian ;
Guan, Tao ;
Wang, Cheng .
SENSORS, 2010, 10 (12) :11259-11273
[9]  
Chiueh Tzi-cker., 1994, VLDB 94, P582
[10]  
Chum O., 2008, P BRIT MACH VIS C, P493