VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity

被引:0
作者
Zhang, Qianxi [1 ]
Xu, Shuotao [1 ]
Chen, Qi [1 ]
Sui, Guoxin [1 ]
Xie, Jiadong [1 ,2 ]
Cai, Zhizhen [1 ,3 ]
Chen, Yaoqi [1 ,3 ]
He, Yinxuan [1 ,4 ]
Yang, Yuqing [1 ]
Yang, Fan [1 ]
Yang, Mao [1 ]
Zhou, Lidong [1 ]
机构
[1] Microsoft Res Asia, Beijing, Peoples R China
[2] East China Normal Univ, Shanghai, Peoples R China
[3] Univ Sci & Technol China, Hefei, Peoples R China
[4] Renmin Univ China, Beijing, Peoples R China
来源
PROCEEDINGS OF THE 17TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, OSDI 2023 | 2023年
关键词
TREES;
D O I
暂无
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Approximate similarity queries on high-dimensional vector indices have become the cornerstone for many critical online services. An increasing need for more sophisticated vector queries requires integrating vector search systems with relational databases. However, high-dimensional vector indices do not exhibit monotonicity, a critical property of conventional indices. The lack of monotonicity forces existing vector systems to rely on monotonicity-preserving tentative indices, set up temporarily for a target vector's TopK nearest neighbors, to facilitate queries. This leads to suboptimal performance due to the difficulty to predict the optimal K. This paper presents VBASE, a system that efficiently supports complex queries of both approximate similarity search and relational operators. VBASE identifies a common property, relaxed monotonicity, to unify two seemingly incompatible systems. This common property allows VBASE to circumvent the constraints of a TopK-only interface to achieve significantly higher efficiency, while provably preserving the semantics of TopK-based solutions. Evaluation results show VBASE offers up to three orders-of-magnitude higher performance than state-of-the-art vector systems on complex online vector queries. VBASE further enables analytical similarity queries that previous vector systems do not, and shows 7,000x speedup with 99.9% accuracy of exact queries.
引用
收藏
页码:377 / 395
页数:19
相关论文
共 90 条
  • [1] Malkov YA, 2018, Arxiv, DOI [arXiv:1603.09320, 10.48550/arXiv.1603.09320]
  • [2] [Anonymous], Benchmarking nearest neighbors
  • [3] [Anonymous], Google multisearch
  • [4] [Anonymous], The TPC-C benchmark
  • [5] [Anonymous], Facebook simsearchnet
  • [6] [Anonymous], Open distro
  • [7] [Anonymous], The TPC-H benchmark
  • [8] [Anonymous], Openai chatgpt retrieval plugin
  • [9] [Anonymous], Billion-scale anns benchmarks
  • [10] [Anonymous], Azure vm fsv2-series