Caching Historical Embeddings in Conversational Search

Cited by: 3
Authors
Frieder, Ophir [1]
Mele, Ida [2]
Muntean, Cristina Ioana [3]
Nardini, Franco Maria [3]
Perego, Raffaele [3]
Tonellotto, Nicola [4]
Affiliations
[1] Georgetown Univ, St Marys Hall 359, Georgetown, DC 20057 USA
[2] IASI CNR, Via Taurini 19, Rome, Lazio, Italy
[3] ISTI CNR, Via Giuseppe Moruzzi 1, Pisa, Tuscany, Italy
[4] Univ Pisa, Via Girolamo Caruso 16, Pisa, Tuscany, Italy
Keywords
Conversational search; similarity search; caching; dense retrieval
DOI
10.1145/3578519
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Rapid response, namely, low latency, is fundamental in search applications; it is particularly so in interactive search sessions, such as those encountered in conversational settings. An observation with a potential to reduce latency asserts that conversational queries exhibit a temporal locality in the lists of documents retrieved. Motivated by this observation, we propose and evaluate a client-side document embedding cache, improving the responsiveness of conversational search systems. By leveraging state-of-the-art dense retrieval models to abstract document and query semantics, we cache the embeddings of documents retrieved for a topic introduced in the conversation, as they are likely relevant to successive queries. Our document embedding cache implements an efficient metric index, answering nearest-neighbor similarity queries by estimating the approximate result sets returned. We demonstrate the efficiency achieved using our cache via reproducible experiments based on Text Retrieval Conference Conversational Assistant Track datasets, achieving a hit rate of up to 75% without degrading answer quality. Our achieved high cache hit rates significantly improve the responsiveness of conversational systems while likewise reducing the number of queries managed on the search back-end.
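The caching idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `EmbeddingCache`, `make_backend`, `k_cache`, and `epsilon` are hypothetical, Euclidean distance stands in for whatever metric the authors use, a brute-force scan replaces the efficient metric index the abstract mentions, and the hit test (a new query counts as a hit when it lies at least `epsilon` inside the ball covered by an earlier back-end query) is one plausible way to estimate whether the cached documents suffice.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Backend = Callable[[Vector, int], List[Tuple[str, Vector]]]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_backend(corpus: Dict[str, Vector]) -> Backend:
    """Hypothetical back-end: exhaustive nearest-neighbor search over a corpus."""
    def backend(query: Vector, k: int) -> List[Tuple[str, Vector]]:
        ranked = sorted(corpus.items(), key=lambda item: distance(query, item[1]))
        return ranked[:k]
    return backend


class EmbeddingCache:
    """Illustrative client-side cache of document embeddings."""

    def __init__(self, backend: Backend, k_cache: int, epsilon: float = 0.0):
        self.backend = backend
        self.k_cache = k_cache          # documents fetched per back-end query
        self.epsilon = epsilon          # safety margin for the hit test
        self.docs: Dict[str, Vector] = {}
        self.past: List[Tuple[Vector, float]] = []  # (query, covered radius)
        self.hits = 0
        self.misses = 0

    def _covered(self, query: Vector) -> bool:
        # Assumed hit test: the query lies at least epsilon inside the ball
        # whose radius is the distance from a past query to its farthest
        # cached document.
        return any(radius - distance(query, past) >= self.epsilon
                   for past, radius in self.past)

    def search(self, query: Vector, k: int) -> List[str]:
        if self._covered(query):
            self.hits += 1
        else:
            # Miss: ask the back-end and cache the returned embeddings.
            self.misses += 1
            results = self.backend(query, self.k_cache)
            for doc_id, embedding in results:
                self.docs[doc_id] = embedding
            if results:
                radius = max(distance(query, emb) for _, emb in results)
                self.past.append((query, radius))
        # Answer from the cache with a brute-force scan.
        ranked = sorted(self.docs.items(),
                        key=lambda item: distance(query, item[1]))
        return [doc_id for doc_id, _ in ranked[:k]]
```

In this sketch, the first utterance on a topic misses and triggers one back-end query for `k_cache` documents; a follow-up query that stays close to it is answered locally, which is the temporal locality the abstract relies on. A larger `epsilon` makes the cache more conservative, trading hit rate for answer quality.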
Pages: 19