Information Retrieval Using Fuzzy Fingerprints

Times cited: 0
Authors
Raposo, Goncalo [1, 2]
Carvalho, Joao Paulo [2 ]
Coheur, Luisa [2 ]
Martins, Bruno [2 ]
Affiliations
[1] Unbabel, Lisbon, Portugal
[2] Univ Lisbon, Inst Super Tecn, INESC ID, Lisbon, Portugal
Source
INFORMATION PROCESSING AND MANAGEMENT OF UNCERTAINTY IN KNOWLEDGE-BASED SYSTEMS, IPMU 2024, VOL 1 | 2024 / Vol. 1174
Keywords
Fuzzy fingerprints; Information retrieval; Embeddings; Transformer-based models
DOI
10.1007/978-3-031-74003-9_9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fuzzy fingerprints, derived from language model embeddings, have shown promise in classification tasks. This paper extends their application to information retrieval, using the well-established MS MARCO dataset. We compare these fingerprints against dense retrieval methods, focusing on both general and retrieval-optimized encoders and on reduced vector sizes. Our findings indicate that although fuzzy fingerprints slightly underperform dense retrieval, their performance remains comparable, especially at smaller vector sizes. This suggests their potential as a memory-efficient retrieval method and shows how much information is carried by the positions of embedding components alone.
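The record above does not spell out how a fingerprint is built or compared, so the following is only a minimal sketch of the general idea of a rank-based fuzzy fingerprint over embedding positions. It assumes a linear membership function over ranks and a min-overlap similarity; the function names `fingerprint`, `similarity`, and `rank_documents`, the toy 4-dimensional vectors, and the choice of `k` are all hypothetical and are not taken from the paper.

```python
def fingerprint(embedding, k):
    """Keep the positions of the k largest components.

    Assumption: membership decreases linearly with rank,
    from 1.0 for the largest component down to 1/k.
    """
    top = sorted(range(len(embedding)), key=lambda i: embedding[i], reverse=True)[:k]
    return {dim: (k - rank) / k for rank, dim in enumerate(top)}


def similarity(fp_a, fp_b):
    """Sum the smaller membership over the positions both fingerprints share."""
    shared = fp_a.keys() & fp_b.keys()
    return sum(min(fp_a[d], fp_b[d]) for d in shared)


def rank_documents(query_embedding, doc_embeddings, k):
    """Return document indices ordered from most to least similar to the query."""
    query_fp = fingerprint(query_embedding, k)
    scores = [similarity(query_fp, fingerprint(doc, k)) for doc in doc_embeddings]
    return sorted(range(len(doc_embeddings)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for encoder outputs (hypothetical values).
query = [0.9, 0.1, 0.5, 0.0]
docs = [
    [0.0, 0.9, 0.1, 0.8],  # strongest positions differ from the query
    [0.8, 0.0, 0.7, 0.1],  # strongest positions match the query
]
order = rank_documents(query, docs, k=2)
```

Under these assumptions each document is stored as k position/membership pairs rather than a full dense vector, which is where the memory saving mentioned in the abstract would come from.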
Pages: 99-112
Page count: 14