PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling

Cited by: 26
Authors
Zhou, Yujia [2]
Dou, Zhicheng [1]
Zhu, Yutao [3]
Wen, Ji-Rong [4, 5]
Affiliations
[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
[2] Renmin Univ China, Sch Informat, Beijing, Peoples R China
[3] Univ Montreal, Montreal, PQ, Canada
[4] Beijing Key Lab Big Data Management & Anal Method, Beijing, Peoples R China
[5] MOE, Key Lab Data Engn & Knowledge Engn, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021 | 2021
Funding
National Natural Science Foundation of China;
Keywords
Personalized search; Self-supervised learning; Contrastive learning;
DOI
10.1145/3459637.3482379
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Personalized search plays a crucial role in improving the user search experience owing to its ability to build user profiles based on historical behaviors. Previous studies have made great progress in extracting personal signals from the query log and learning user representations. However, neural personalized search depends heavily on sufficient data to train the user model, and data sparsity is an inevitable challenge for existing methods that try to learn high-quality user representations. Moreover, the overemphasis on final ranking quality leads to coarse data representations and impairs the generalizability of the model. To tackle these issues, we propose a Personalized Search framework with Self-supervised Learning (PSSL) to enhance data representations. Specifically, we adopt a contrastive sampling method to extract paired self-supervised information from sequences of user behaviors in query logs. Four auxiliary tasks are designed to pre-train the sentence encoder and the sequence encoder used in the ranking model. They are optimized with a contrastive loss that aims to reduce the distance between similar user sequences, queries, and documents. Experimental results on two datasets demonstrate that our proposed model PSSL achieves state-of-the-art performance compared with existing baselines.
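The abstract says the auxiliary tasks are optimized with a contrastive loss that pulls similar sequences, queries, and documents together. The record does not give the exact formulation, so the following is only a minimal sketch of a generic InfoNCE-style loss, which is one common choice and not necessarily the one used in PSSL. The function names, the toy vectors, and the temperature value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor (assumed form, not PSSL's).

    The loss is the negative log-probability of picking the positive
    among the positive plus all negatives, using scaled cosine similarity.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


# Hypothetical toy embeddings: a query, a similar query, and unrelated ones.
query = [1.0, 0.0]
similar_query = [0.9, 0.1]
unrelated = [[0.0, 1.0], [-1.0, 0.0]]

# Loss is small when the true positive is the closest candidate...
loss_good = info_nce_loss(query, similar_query, unrelated)
# ...and larger when an unrelated vector is treated as the positive.
loss_bad = info_nce_loss(query, unrelated[0], [similar_query, unrelated[1]])
print(loss_good < loss_bad)
```

Minimizing a loss of this shape raises the similarity of the anchor to its positive relative to the negatives, which matches the "reduce the distance between similar items" behavior the abstract describes.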
Pages: 2749-2758
Page count: 10