Caching Historical Embeddings in Conversational Search

Cited by: 3

Authors:

Frieder, Ophir [1]

Mele, Ida [2]

Muntean, Cristina Ioana [3]

Nardini, Franco Maria [3]

Perego, Raffaele [3]

Tonellotto, Nicola [4]

Affiliations:

[1] Georgetown Univ, St Marys Hall 359, Georgetown, DC 20057 USA

[2] IASI CNR, Via Taurini 19, Rome, Lazio, Italy

[3] ISTI CNR, Via Giuseppe Moruzzi 1, Pisa, Tuscany, Italy

[4] Univ Pisa, Via Girolamo Caruso 16, Pisa, Tuscany, Italy

Source:

ACM TRANSACTIONS ON THE WEB | 2024 / Vol. 18 / No. 04

Keywords:

Conversational search; similarity search; caching; dense retrieval

DOI:

10.1145/3578519

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Rapid response, namely, low latency, is fundamental in search applications; it is particularly so in interactive search sessions, such as those encountered in conversational settings. An observation with a potential to reduce latency asserts that conversational queries exhibit a temporal locality in the lists of documents retrieved. Motivated by this observation, we propose and evaluate a client-side document embedding cache, improving the responsiveness of conversational search systems. By leveraging state-of-the-art dense retrieval models to abstract document and query semantics, we cache the embeddings of documents retrieved for a topic introduced in the conversation, as they are likely relevant to successive queries. Our document embedding cache implements an efficient metric index, answering nearest-neighbor similarity queries by estimating the approximate result sets returned. We demonstrate the efficiency achieved using our cache via reproducible experiments based on Text Retrieval Conference Conversational Assistant Track datasets, achieving a hit rate of up to 75% without degrading answer quality. Our achieved high cache hit rates significantly improve the responsiveness of conversational systems while likewise reducing the number of queries managed on the search back-end.
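
The caching idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `EmbeddingCache`, the `fetch_k` and `margin` parameters, the ball-coverage hit test, the Euclidean metric, and the brute-force toy back-end are all assumptions made for illustration. The sketch stores the embeddings returned for earlier queries and answers a later query from the cache when it falls well inside a region that was already fetched; answers served from the cache are approximate.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical back-end signature: (query_embedding, k) -> [(doc_id, doc_embedding), ...]
Backend = Callable[[Vector, int], List[Tuple[str, Vector]]]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class EmbeddingCache:
    """Toy client-side cache of document embeddings (illustrative only)."""

    def __init__(self, backend: Backend, fetch_k: int = 100, margin: float = 0.0):
        self.backend = backend
        self.fetch_k = fetch_k  # documents fetched per back-end call (assumed)
        self.margin = margin    # slack required to count as a hit (assumed)
        self.docs: Dict[str, Vector] = {}
        self.covered: List[Tuple[Vector, float]] = []  # (past query, radius)
        self.hits = 0
        self.misses = 0

    def _is_covered(self, q: Vector) -> bool:
        # Assumed hit test: q lies inside a previously fetched ball,
        # with more than `margin` to spare before the ball's boundary.
        return any(r - euclidean(q, c) > self.margin for c, r in self.covered)

    def search(self, q: Vector, k: int) -> List[str]:
        if self._is_covered(q):
            self.hits += 1
        else:
            self.misses += 1
            results = self.backend(q, self.fetch_k)
            for doc_id, emb in results:
                self.docs[doc_id] = emb
            if results:
                # Radius = distance from q to the farthest document fetched.
                radius = max(euclidean(q, emb) for _, emb in results)
                self.covered.append((q, radius))
        # Answer from the cached embeddings only (approximate on a hit).
        ranked = sorted(self.docs, key=lambda d: euclidean(q, self.docs[d]))
        return ranked[:k]


# Toy corpus and brute-force back-end, both hypothetical.
corpus = {"d1": (0.0, 0.0), "d2": (1.0, 0.0), "d3": (0.0, 1.0),
          "d4": (5.0, 5.0), "d5": (5.0, 6.0)}


def brute_force_backend(q: Vector, k: int) -> List[Tuple[str, Vector]]:
    ranked = sorted(corpus.items(), key=lambda kv: euclidean(q, kv[1]))
    return ranked[:k]


cache = EmbeddingCache(brute_force_backend, fetch_k=3, margin=0.1)
first = cache.search((0.1, 0.1), k=2)   # cold cache: goes to the back-end
second = cache.search((0.2, 0.1), k=2)  # close to the first query: cache hit
third = cache.search((5.0, 5.5), k=2)   # far from cached region: back-end again
print(cache.hits, cache.misses)
```

In this sketch a larger `margin` trades hit rate for answer quality: fewer queries are served locally, but those that are lie deeper inside a fetched region, so their true nearest neighbors are more likely to be in the cache already.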

Pages: 19
