Web-Scale Semantic Product Search with Large Language Models

Cited by: 4

Authors:

Muhamed, Aashiq [1]

Srinivasan, Sriram [1]

Teo, Choon-Hui [1]

Cui, Qingjun [1]

Zeng, Belinda [2]

Chilimbi, Trishul [2]

Vishwanathan, S. V. N. [1]

Affiliations:

[1] Amazon, Palo Alto, CA 94303 USA

[2] Amazon, Seattle, WA USA

Source:

ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT III | 2023 / Vol. 13937

Keywords:

Matching; Retrieval; Search; Pretrained Language Models

DOI:

10.1007/978-3-031-33380-4_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Dense embedding-based semantic matching is widely used in e-commerce product search to address the shortcomings of lexical matching, such as sensitivity to spelling variants. Recent advances in BERT-like language model encoders have, however, not found their way into real-time search because of the strict inference latency requirements imposed on e-commerce websites. While bi-encoder BERT architectures enable fast approximate nearest neighbor search, training them effectively on query-product data remains a challenge due to training instabilities and the persistent generalization gap with cross-encoders. In this work, we propose a four-stage training procedure to leverage large BERT-like models for product search while preserving low inference latency. We introduce query-product interaction pre-finetuning to effectively pretrain BERT bi-encoders for matching and improve generalization. Through offline experiments on an e-commerce product dataset, we show that a distilled small BERT-based model (75M parameters) trained using our approach improves the search relevance metric by up to 23% over a baseline DSSM-based model with similar inference latency. The small model suffers only a 3% drop in the relevance metric compared to the 20x larger teacher. We also show, using online A/B tests at scale, that our approach improves over the production model in exact and substitute products retrieved.
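The bi-encoder retrieval pattern mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's model: `toy_encode` is a hypothetical stand-in that uses normalized character-trigram counts in place of a trained BERT tower, and the brute-force scan in `search` stands in for an approximate nearest neighbor index. All function names and the product titles are assumptions made for illustration. What the sketch does show is the structure that keeps query-time latency low: products are encoded once offline, independently of any query, and a query needs only one encoding plus a similarity lookup.

```python
import math
from collections import Counter


def toy_encode(text):
    """Hypothetical stand-in for a trained encoder tower.

    Returns an L2-normalized vector of character-trigram counts,
    stored sparsely as {trigram: weight}.
    """
    padded = f" {text.lower()} "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def dot(u, v):
    """Dot product of two sparse vectors (cosine similarity when normalized)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(w * v.get(gram, 0.0) for gram, w in u.items())


def build_index(products):
    """Offline step: encode every product once, without seeing any query."""
    return [(title, toy_encode(title)) for title in products]


def search(query, index, k=3):
    """Online step: encode the query once, then rank products by similarity.

    Assumption: a brute-force scan replaces the approximate nearest
    neighbor search a production system would use.
    """
    q = toy_encode(query)
    scored = [(title, dot(q, vec)) for title, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical catalog and a misspelled query.
products = ["wireless headphones", "usb charging cable", "running shoes"]
index = build_index(products)
results = search("wireless hedphones", index, k=2)
```

Because the query and product vectors never interact before the final dot product, the product side can be precomputed; a cross-encoder, by contrast, must process each query-product pair jointly at query time.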

Pages: 73-85

Page count: 13
