A Learning-to-Rank Formulation of Clustering-Based Approximate Nearest Neighbor Search

Cited by: 1
Authors
Vecchiato, Thomas [1 ]
Lucchese, Claudio [1 ]
Nardini, Franco Maria [2 ]
Bruch, Sebastian [3 ]
Affiliations
[1] Ca Foscari Univ Venice, Venice, Italy
[2] ISTI CNR, Pisa, Italy
[3] Pinecone, New York, NY USA
Source
PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024 | 2024
Keywords
Approximate Nearest Neighbor Search; Inverted File; Learning to Rank
DOI
10.1145/3626772.3657931
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
A critical piece of the modern information retrieval puzzle is approximate nearest neighbor search. Its objective is to return a set of k data points that are closest to a query point, with its accuracy measured by the proportion of exact nearest neighbors captured in the returned set. One popular approach to this problem is clustering: The indexing algorithm partitions data points into non-overlapping subsets and represents each partition by a point such as its centroid. The query processing algorithm first identifies the nearest clusters, a process known as routing, then performs a nearest neighbor search over those clusters only. In this work, we make a simple observation: The routing function solves a ranking problem. Its quality can therefore be assessed with a ranking metric, making the function amenable to learning-to-rank. Interestingly, ground truth is often freely available: Given a query distribution in a top-k configuration, the ground truth is the set of clusters that contain the exact top-k vectors. We develop this insight and apply it to Maximum Inner Product Search (MIPS). As we demonstrate empirically on various datasets, learning a simple linear function consistently improves the accuracy of clustering-based MIPS.
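The pipeline the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it builds a toy clustering-based index, routes a query with the standard inner-product-to-centroid heuristic (the fixed scorer that the paper proposes to replace with a learned function), searches only the probed clusters, and measures recall of the exact top-k — the ranking-quality metric the routing step is judged by. All sizes and the number of probed clusters are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2000 points in 16 dimensions; one query; we want the top-k by inner product.
X = rng.normal(size=(2000, 16)).astype(np.float32)
q = rng.normal(size=16).astype(np.float32)
k = 10

# --- Indexing: partition the points with a few rounds of plain k-means. ---
n_clusters = 32
centroids = X[rng.choice(len(X), n_clusters, replace=False)].copy()
for _ in range(10):
    # Assign each point to its nearest centroid (squared Euclidean distance).
    assign = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(n_clusters):
        members = X[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# --- Routing: rank the clusters by a scoring function of the query. ---
# The common heuristic scores cluster c by <q, centroid_c>; the paper's observation
# is that any such scorer solves a ranking problem, so it can be learned instead.
scores = centroids @ q
n_probe = 4
probed = np.argsort(-scores)[:n_probe]

# --- Search: scan only the points that fall in the probed clusters. ---
cand_idx = np.flatnonzero(np.isin(assign, probed))
cand_scores = X[cand_idx] @ q
approx_topk = cand_idx[np.argsort(-cand_scores)[:k]]

# --- Accuracy: fraction of the exact top-k recovered by the probed clusters. ---
exact_topk = np.argsort(-(X @ q))[:k]
recall = len(set(approx_topk) & set(exact_topk)) / k
print(f"recall@{k} = {recall:.2f}")
```

Note how the routing ground truth the abstract mentions falls out for free here: the clusters `assign[exact_topk]` are exactly the ones a perfect router should rank first, which is what makes supervised learning-to-rank training data cheap to generate.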
Pages: 2261-2265 (5 pages)