Query-Oriented Data Augmentation for Session Search

Citations: 0
Authors
Chen, Haonan [1]
Dou, Zhicheng [1]
Zhu, Yutao [1]
Wen, Ji-Rong [1]
Affiliations
[1] Renmin Univ China, Engn Res Ctr Next Generat Intelligent Search & Rec, Gaoling Sch Artificial Intelligence, Minist Educ, Beijing 100872, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Context modeling; Data models; Data augmentation; Training data; Search problems; Task analysis; Query-oriented data augmentation; session search; document ranking;
DOI
10.1109/TKDE.2024.3419131
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modeling contextual information in a search session has drawn more and more attention when understanding complex user intents. Recent methods are all data-driven, i.e., they train different models on large-scale search log data to identify the relevance between search contexts and candidate documents. The common training paradigm is to pair the search context with different candidate documents and train the model to rank the clicked documents higher than the unclicked ones. However, this paradigm neglects the symmetric nature of the relevance between the session context and document, i.e., the clicked documents can also be paired with different search contexts when training. In this work, we propose query-oriented data augmentation to enrich search logs and empower the modeling. We generate supplemental training pairs by altering the most important part of a search context, i.e., the current query, and train our model to rank the generated sequence along with the original sequence. This approach enables models to learn that the relevance of a document may vary as the session context changes, leading to a better understanding of users' search patterns. We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty. Through experimentation on two extensive public search logs, we have successfully demonstrated the effectiveness of our model.
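The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two alteration strategies (`mask_terms`, `replace_query`), the `[SEP]`-joined sequence format, the `Session` type, and the hinge loss are all assumptions chosen for illustration. The sketch only shows the general shape: keep the clicked document fixed, alter the current query, and train so that the original sequence scores above the altered one.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Session:
    """Hypothetical container for one training example from a search log."""
    history: List[str]   # earlier queries in the session
    current_query: str   # the query being altered
    clicked_doc: str     # document the user clicked


def mask_terms(query: str, rng: random.Random, ratio: float = 0.5) -> str:
    """Drop a share of the query terms (assumed 'harder' alteration:
    the result stays close to the original query)."""
    terms = query.split()
    if len(terms) <= 1:
        return query  # nothing can be dropped without emptying the query
    keep = max(1, int(round(len(terms) * (1 - ratio))))
    kept_idx = sorted(rng.sample(range(len(terms)), keep))
    return " ".join(terms[i] for i in kept_idx)


def replace_query(query: str, pool: List[str], rng: random.Random) -> str:
    """Swap in an unrelated query from a pool (assumed 'easier' alteration)."""
    candidates = [q for q in pool if q != query]
    return rng.choice(candidates) if candidates else query


def build_sequence(history: List[str], query: str, doc: str) -> str:
    """Flatten context, query, and document into one input string."""
    return " [SEP] ".join(history + [query, doc])


def augment(session: Session, pool: List[str],
            rng: random.Random) -> List[Tuple[str, str]]:
    """Return (original_sequence, altered_sequence) training pairs."""
    original = build_sequence(session.history, session.current_query,
                              session.clicked_doc)
    altered_queries = [
        mask_terms(session.current_query, rng),
        replace_query(session.current_query, pool, rng),
    ]
    pairs = []
    for altered in altered_queries:
        if altered != session.current_query:  # skip no-op alterations
            pairs.append((original, build_sequence(session.history, altered,
                                                   session.clicked_doc)))
    return pairs


def pairwise_hinge_loss(pos_score: float, neg_score: float,
                        margin: float = 1.0) -> float:
    """Penalize cases where the original sequence does not beat the
    altered one by at least `margin`."""
    return max(0.0, margin - pos_score + neg_score)
```

In use, each pair from `augment` would be scored by whatever ranking model is being trained, and `pairwise_hinge_loss` would be added to the usual clicked-versus-unclicked ranking loss.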
Pages: 6877-6888
Number of pages: 12