GS2P: a generative pre-trained learning to rank model with over-parameterization for web-scale search

Cited by: 7
Authors
Li, Yuchen [1]
Xiong, Haoyi [2]
Kong, Linghe [1]
Bian, Jiang [2]
Wang, Shuaiqiang [2]
Chen, Guihai [1]
Yin, Dawei [2]
Affiliations
[1] Shanghai Jiao Tong University, Shanghai, China
[2] Baidu Inc, Beijing, China
Keywords
Learning to rank; Data reconstruction; Pre-training; Web search
DOI
10.1007/s10994-023-06469-9
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
While learning to rank (LTR) is widely employed in web search to prioritize relevant webpages among retrieved content given input queries, traditional LTR models stumble over two principal obstacles that lead to subpar performance: (1) the scarcity of well-annotated query-webpage pairs with ranking scores, which limits coverage of search queries across the popularity spectrum, and (2) ill-trained models that fail to induce generalized representations for LTR, culminating in overfitting. To tackle these challenges, we propose a Generative Semi-Supervised Pre-trained (GS²P) learning-to-rank model. Specifically, GS²P first generates pseudo-labels for unlabeled samples using tree-based LTR models after a series of co-training procedures, and then learns representations of query-webpage pairs with self-attentive transformers via both a discriminative (LTR) loss and a generative (denoising-autoencoder reconstruction) loss. Finally, GS²P boosts LTR performance by incorporating Random Fourier Features to over-parameterize the model into the "interpolating regime", so as to enjoy the further descent of generalization error on the learned representations. We conduct extensive offline experiments on a publicly available dataset and a real-world dataset collected from a large-scale search engine. The results show that GS²P achieves the best performance on both datasets compared to the baselines. We also deploy GS²P at a large-scale web search engine with realistic traffic, where we still observe significant improvements in real-world applications; GS²P performs consistently in both online and offline experiments.
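The abstract sketches three stages: co-trained tree-based pseudo-labeling, transformer pre-training under joint discriminative and generative losses, and over-parameterization via Random Fourier Features (RFF). As a rough illustration only, the following PyTorch sketch shows what the second and third stages could look like; every name, dimension, and loss choice here (D_REPR, RFF_DIM, gs2p_pretrain_loss, pointwise MSE in place of the paper's ranking loss) is an assumption for illustration, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of two GS2P ingredients described
# in the abstract: (a) a joint discriminative + generative pre-training loss,
# and (b) an over-parameterized scoring head built on Random Fourier Features.
# Dimensions, names (D_REPR, RFF_DIM, gs2p_pretrain_loss), and the pointwise
# MSE ranking loss are illustrative assumptions, not from the paper.
import math
import torch
import torch.nn.functional as F

D_REPR = 128      # assumed width of the learned query-webpage representation
RFF_DIM = 8192    # assumed RFF width >> number of labeled pairs ("interpolating regime")

# (a) Pre-training objective: an LTR loss on (pseudo-)labeled pairs plus a
# denoising-autoencoder reconstruction loss on corrupted inputs.
def gs2p_pretrain_loss(scores, labels, reconstruction, clean_features, alpha=0.5):
    ltr_loss = F.mse_loss(scores, labels)                  # discriminative stand-in
    dae_loss = F.mse_loss(reconstruction, clean_features)  # generative (reconstruction)
    return ltr_loss + alpha * dae_loss

# (b) Fixed random projection approximating an RBF kernel:
# phi(x) = sqrt(2 / D) * cos(W x + b), with W ~ N(0, sigma^-2 I), b ~ U[0, 2*pi].
sigma = 1.0
W = torch.randn(RFF_DIM, D_REPR) / sigma
b = 2 * math.pi * torch.rand(RFF_DIM)

def rff_features(reprs: torch.Tensor) -> torch.Tensor:
    """Lift (n, D_REPR) representations into (n, RFF_DIM) random Fourier features."""
    return math.sqrt(2.0 / RFF_DIM) * torch.cos(reprs @ W.T + b)

# A linear head on the frozen RFF map; with RFF_DIM far above the number of
# labeled pairs, this head can interpolate the training labels.
score_head = torch.nn.Linear(RFF_DIM, 1, bias=False)

def ranking_scores(reprs: torch.Tensor) -> torch.Tensor:
    return score_head(rff_features(reprs)).squeeze(-1)

if __name__ == "__main__":
    pairs = torch.randn(16, D_REPR)     # 16 mock query-webpage representations
    print(ranking_scores(pairs).shape)  # torch.Size([16])
```

The design point this sketch tries to capture is that once RFF_DIM exceeds the number of labeled pairs, the linear head enters the interpolating regime the abstract refers to, where generalization error can descend again as model width grows.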
Pages: 5331-5349
Number of pages: 19