Web-Scale Semantic Product Search with Large Language Models

Cited by: 3
Authors
Muhamed, Aashiq [1]
Srinivasan, Sriram [1]
Teo, Choon-Hui [1]
Cui, Qingjun [1]
Zeng, Belinda [2]
Chilimbi, Trishul [2]
Vishwanathan, S. V. N. [1]
Affiliations
[1] Amazon, Palo Alto, CA 94303 USA
[2] Amazon, Seattle, WA USA
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT III | 2023, Vol. 13937
Keywords
Matching; Retrieval; Search; Pretrained Language Models;
DOI
10.1007/978-3-031-33380-4_6
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dense embedding-based semantic matching is widely used in e-commerce product search to address the shortcomings of lexical matching, such as sensitivity to spelling variants. Recent advances in BERT-like language model encoders have, however, not found their way to real-time search because of the strict inference latency requirements imposed on e-commerce websites. While bi-encoder BERT architectures enable fast approximate nearest neighbor search, training them effectively on query-product data remains a challenge due to training instabilities and the persistent generalization gap with cross-encoders. In this work, we propose a four-stage training procedure to leverage large BERT-like models for product search while preserving low inference latency. We introduce query-product interaction pre-finetuning to effectively pretrain BERT bi-encoders for matching and improve generalization. Through offline experiments on an e-commerce product dataset, we show that a distilled small BERT-based model (75M params) trained using our approach improves the search relevance metric by up to 23% over a baseline DSSM-based model with similar inference latency. The small model suffers only a 3% drop in the relevance metric compared to the 20x larger teacher. We also show, using online A/B tests at scale, that our approach improves over the production model in exact and substitute products retrieved.
Pages: 73-85
Page count: 13