Improving Bi-encoder Document Ranking Models with Two Rankers and Multi-teacher Distillation

Cited by: 14
Authors
Choi, Jaekeol [1 ,2 ]
Jung, Euna [3 ]
Suh, Jangwon [3 ]
Rhee, Wonjong [4 ]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
[2] Naver Corp, Seongnam Si, South Korea
[3] Seoul Natl Univ, GSCST, Seoul, South Korea
[4] Seoul Natl Univ, GSCST, GSAI, AIIS, Seoul, South Korea
Source
SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval | 2021
Keywords
Information retrieval; neural ranking model; bi-encoder; knowledge distillation; multi-teacher distillation
DOI
10.1145/3404835.3463076
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
BERT-based Neural Ranking Models (NRMs) can be classified according to how the query and document are encoded through BERT's self-attention layers: bi-encoder versus cross-encoder. Bi-encoder models are highly efficient because all the documents can be pre-processed before query time, but their performance is inferior to that of cross-encoder models. Both models use a ranker that receives BERT representations as input and produces a relevance score as output. In this work, we propose a method in which multi-teacher distillation is applied to a cross-encoder NRM and a bi-encoder NRM to produce a bi-encoder NRM with two rankers. The resulting student bi-encoder achieves improved performance by simultaneously learning from a cross-encoder teacher and a bi-encoder teacher, and by combining the relevance scores from its two rankers. We call this method TRMD (Two Rankers and Multi-teacher Distillation). In the experiments, TwinBERT and ColBERT are considered as baseline bi-encoders. When monoBERT is used as the cross-encoder teacher, together with either TwinBERT or ColBERT as the bi-encoder teacher, TRMD produces a student bi-encoder that performs better than the corresponding baseline bi-encoder. For P@20, the maximum improvement was 11.4% and the average improvement was 6.8%. As an additional experiment, we considered producing cross-encoder students with TRMD and found that it could also improve the cross-encoders.
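The training recipe described in the abstract (a student bi-encoder whose two rankers are distilled from a cross-encoder teacher and a bi-encoder teacher, with the final relevance score combining both rankers' outputs) can be illustrated with a minimal PyTorch sketch. The interaction layer, loss terms, and weighting below are illustrative assumptions for exposition, not the authors' exact architecture or hyperparameters.

```python
# Minimal sketch of the two-ranker, multi-teacher distillation idea (TRMD)
# described in the abstract. Layer shapes, the concatenation interaction,
# and the loss weighting are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoRankerStudent(nn.Module):
    """Student bi-encoder head with two rankers over pre-computed query/doc vectors."""

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # One ranker is guided by the cross-encoder teacher, the other by the bi-encoder teacher.
        self.ranker_from_cross = nn.Linear(2 * hidden_size, 1)
        self.ranker_from_bi = nn.Linear(2 * hidden_size, 1)

    def forward(self, q_vec: torch.Tensor, d_vec: torch.Tensor):
        # q_vec, d_vec: representations produced by the student's BERT bi-encoder
        # (documents can be encoded offline, queries at query time).
        joint = torch.cat([q_vec, d_vec], dim=-1)
        score_cross = self.ranker_from_cross(joint).squeeze(-1)
        score_bi = self.ranker_from_bi(joint).squeeze(-1)
        return score_cross, score_bi


def trmd_loss(score_cross, score_bi, teacher_cross, teacher_bi, labels, alpha=0.5):
    """Ranking loss on the combined score plus distillation from both teachers."""
    distill = F.mse_loss(score_cross, teacher_cross) + F.mse_loss(score_bi, teacher_bi)
    ranking = F.binary_cross_entropy_with_logits(score_cross + score_bi, labels)
    return alpha * ranking + (1.0 - alpha) * distill


# Usage with random stand-ins for encoder outputs and teacher scores.
student = TwoRankerStudent()
q, d = torch.randn(8, 768), torch.randn(8, 768)
s_cross, s_bi = student(q, d)
loss = trmd_loss(
    s_cross, s_bi,
    teacher_cross=torch.randn(8),           # scores from the cross-encoder teacher (e.g. monoBERT)
    teacher_bi=torch.randn(8),              # scores from the bi-encoder teacher (e.g. ColBERT/TwinBERT)
    labels=torch.randint(0, 2, (8,)).float(),
)
loss.backward()
final_score = s_cross + s_bi                # at ranking time, combine the two rankers' scores
```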
Pages: 2192-2196
Number of pages: 5
References
14 in total
[1] Dehghani, Mostafa; Zamani, Hamed; Severyn, Aliaksei; Kamps, Jaap; Croft, W. Bruce. Neural Ranking Models with Weak Supervision. In SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 65-74.
[2] Fukuda, Takashi; Suzuki, Masayuki; Kurata, Gakuto; Thomas, Samuel; Cui, Jia; Ramabhadran, Bhuvana. Efficient Knowledge Distillation from an Ensemble of Teachers. In Proceedings of Interspeech 2017 (18th Annual Conference of the International Speech Communication Association), 2017, pp. 3697-3701.
[3] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.
[4] Hinton, Geoffrey. Distilling the Knowledge in a Neural Network. arXiv preprint, 2015.
[5] Hui, Kai; Yates, Andrew; Berberich, Klaus; de Melo, Gerard. Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval. In WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018, pp. 279-287.
[6] Humeau, S. In Proceedings of the ICLR, 2020.
[7] Huston, S. Parameters learned in the comparison of retrieval models using term dependencies. 2014.
[8] Khattab, Omar; Zaharia, Matei. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '20), 2020, pp. 39-48.
[9] King, D. B. ACS Symposium Series, 2015, vol. 1214, p. 1. DOI: 10.1021/bk-2015-1214.ch001.
[10] Lu, Wenhao; Jiao, Jian; Zhang, Ruofei. TwinBERT: Distilling Knowledge to Twin-Structured Compressed BERT Models for Large-Scale Retrieval. In CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 2645-2652.