Hash Bit Selection for Nearest Neighbor Search

被引:36
|
作者
Liu, Xianglong [1 ]
He, Junfeng [2 ]
Chang, Shih-Fu [3 ]
机构
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing 10091, Peoples R China
[2] Google, Mountain View, CA 94043 USA
[3] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
基金
中国国家自然科学基金;
关键词
Nearest neighbor search; hash bit selection; bit reliability; bit complementarity; normalized dominant set; locality-sensitive hashing; QUANTIZATION;
D O I
10.1109/TIP.2017.2695895
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
To overcome the barrier of storage and computation when dealing with gigantic-scale data sets, compact hashing has been studied extensively to approximate the nearest neighbor search. Despite the recent advances, critical design issues remain open in how to select the right features, hashing algorithms, and/or parameter settings. In this paper, we address these by posing an optimal hash bit selection problem, in which an optimal subset of hash bits are selected from a pool of candidate bits generated by different features, algorithms, or parameters. Inspired by the optimization criteria used in existing hashing algorithms, we adopt the bit reliability and their complementarity as the selection criteria that can be carefully tailored for hashing performance in different tasks. Then, the bit selection solution is discovered by finding the best tradeoff between search accuracy and time using a modified dynamic programming method. To further reduce the computational complexity, we employ the pairwise relationship among hash bits to approximate the highorder independence property, and formulate it as an efficient quadratic programming method that is theoretically equivalent to the normalized dominant set problem in a vertex-and edge-weighted graph. Extensive large-scale experiments have been conducted under several important application scenarios of hash techniques, where our bit selection framework can achieve superior performance over both the naive selection methods and the state-of-the-art hashing algorithms, with significant accuracy gains ranging from 10% to 50%, relatively.
引用
收藏
页码:5367 / 5380
页数:14
相关论文
共 50 条
  • [11] MLSH: Mixed Hash Function Family for Approximate Nearest Neighbor Search in Multiple Fractional Metrics
    Lu, Kejing
    Kudo, Mineichi
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II, 2021, 12682 : 569 - 584
  • [12] Succinct nearest neighbor search
    Tellez, Eric Sadit
    Chavez, Edgar
    Navarro, Gonzalo
    INFORMATION SYSTEMS, 2013, 38 (07) : 1019 - 1030
  • [13] Confirmation Sampling for Exact Nearest Neighbor Search
    Christiani, Tobias
    Pagh, Rasmus
    Thorup, Mikkel
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2020, 2020, 12440 : 97 - 110
  • [14] Revisiting kd-tree for Nearest Neighbor Search
    Ram, Parikshit
    Sinha, Kaushik
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 1378 - 1388
  • [15] A Novel Feature Selection Method for Nearest Neighbor Search in Binary Embedding Codes
    Chiu, Chih-Yi
    Liou, Yu-Cyuan
    2015 24TH WIRELESS AND OPTICAL COMMUNICATION CONFERENCE (WOCC), 2015, : 201 - 205
  • [16] Cloud service selection based on weighted KD tree nearest neighbor search
    Bi, Wenhao
    Ma, Junwen
    Zhu, Xudong
    Wang, Weixiang
    Zhang, An
    APPLIED SOFT COMPUTING, 2022, 131
  • [17] Quality and Efficiency in High Dimensional Nearest Neighbor Search
    Tao, Yufei
    Yi, Ke
    Sheng, Cheng
    Kalnis, Panos
    ACM SIGMOD/PODS 2009 CONFERENCE, 2009, : 563 - 575
  • [18] Hardness of Approximate Nearest Neighbor Search
    Rubinstein, Aviad
    STOC'18: PROCEEDINGS OF THE 50TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2018, : 1260 - 1268
  • [19] Projection Search For Approximate Nearest Neighbor
    Feng, Cheng
    Yang, Bo
    2016 17TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2016, : 33 - 38
  • [20] Practical Nearest Neighbor Search in the Plane
    Connor, Michael
    Kumar, Piyush
    EXPERIMENTAL ALGORITHMS, PROCEEDINGS, 2010, 6049 : 501 - 512