A novel cell partition method by introducing Silhouette Coefficient for fast approximate nearest neighbor search

被引:12
作者
Song, Wenwen [1 ]
Wang, Yang [2 ]
Pan, Zhibin [1 ,3 ]
机构
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian, Peoples R China
[2] Xian Univ Technol, Dept Informat Sci, Xian, Peoples R China
[3] Xi An Jiao Tong Univ, Res Inst, Zhejiang, Peoples R China
基金
中国国家自然科学基金;
关键词
Approximate nearest neighbor search; Silhouette Coefficient; Cell partition; Adaptive sampling rate; PRODUCT QUANTIZATION; ALGORITHM;
D O I
10.1016/j.ins.2023.119216
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In this paper, a novel cell partition method is proposed to accelerate the approximate nearest neighbor (ANN) search. The traditional cell-level elimination of database vectors, such as the inverted file system (IVF), has limitations on accelerating the search process. Therefore, the Silhouette Coefficient is introduced to realize more accurate and effective vector-level pruning. The value of the Silhouette Coefficient is used as a novel criterion for dividing the Voronoi cell into the center region and the border region. In addition, a dynamic threshold selection model is introduced to determine the sampling rate for each cell adaptively. The proposed method can greatly improve the search speed by reducing the vector-level distance calculation. Besides, the value of the Silhouette Coefficient is saved by offline generating a rank in each Voronoi cell, which does not bring any extra memory cost. Through experiments, it can be seen that for two widely-used datasets of SIFT1M and GIST1M, with no more than 1% accuracy reduction, the search speed can be significantly increased by about 44% compared with the standard inverted file system. And compared with state-of-the-art ANN search methods that aim at acceleration, the proposed method can also achieve higher search accuracy with almost the same computational consumption.
引用
收藏
页数:16
相关论文
共 40 条
[1]   Approximate Nearest Neighbor Search Using Enhanced Accumulative Quantization [J].
Ai, Liefu ;
Cheng, Hongjun ;
Wang, Xiaoxiao ;
Chen, Chunsheng ;
Liu, Deyang ;
Zheng, Xin ;
Wang, Yuanzhi .
ELECTRONICS, 2022, 11 (14)
[2]   Quarter-Point Product Quantization for approximate nearest neighbor search [J].
An, Shan ;
Huang, Zhibiao ;
Bai, Shuang ;
Che, Guangfu ;
Ma, Xin ;
Luo, Jie ;
Chen, Yu .
PATTERN RECOGNITION LETTERS, 2019, 125 :187-194
[3]   The Inverted Multi-Index [J].
Babenko, Artem ;
Lempitsky, Victor .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (06) :1247-1260
[4]   Additive Quantization for Extreme Vector Compression [J].
Babenko, Artem ;
Lempitsky, Victor .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :931-938
[5]   Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors [J].
Baranchuk, Dmitry ;
Babenko, Artem ;
Malkov, Yury .
COMPUTER VISION - ECCV 2018, PT XII, 2018, 11216 :209-224
[6]   A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search [J].
Cai, Deng .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (06) :2337-2348
[7]   Approximate Nearest Neighbor Search by Residual Vector Quantization [J].
Chen, Yongjian ;
Guan, Tao ;
Wang, Cheng .
SENSORS, 2010, 10 (12) :11259-11273
[8]  
Fan Bin, 2019, PATTERN RECOGNIT, V99
[9]   Codebook-softened product quantization for high accuracy approximate nearest neighbor search [J].
Fan, Jingya ;
Pan, Zhibin ;
Wang, Liangzhuang ;
Wang, Yang .
NEUROCOMPUTING, 2022, 507 :107-116
[10]  
Gan J., 2012, SIGMOD, P541, DOI DOI 10.1145/2213836.2213898