Query Sub-intent Mining by Incorporating Search Results with Query Logs for Information Retrieval

Cited by: 0
Author
Liu, Xinyu [1]
Affiliation
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
Source
2023 IEEE 8TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS, ICBDA | 2023
Keywords
query sub-intent mining; natural language processing; pre-trained language model; information retrieval;
DOI
10.1109/ICBDA57405.2023.10104948
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining query sub-intents, or sub-topics, is one of the important tasks in information retrieval. It offers the user several candidate queries for exploring their possible search intents. With the development of Big Data and Natural Language Processing, pre-trained language models have been applied to model the complex semantic information of different text resources in order to mine robust query sub-intents. These studies usually use search results and query logs independently, as two separate resources for generating query sub-intents. However, we argue that the contextual information contained in search results and the user interest information contained in query logs can be combined to make sub-intent mining more effective, making the most of both resources. To generate high-quality sub-intents, we design a sequence-to-sequence pre-trained language model that takes search result texts and query suggestions extracted from query logs as input and outputs generated sub-intent phrases. To model the relation between search results and query logs, we design two information encoders and a novel attention mechanism in the decoder. At each decoding step, the model weights its attention between the input search results and query logs to determine the output token. In experiments on the MIMICS dataset, our model outperforms strong baseline methods on almost all evaluation metrics, illustrating the effectiveness of the proposed method. We also conduct ablation studies to show the individual contributions of search results and query logs, and we experimentally compare different generation paradigms for sub-intents. Finally, we present several generated examples to illustrate the quality of the generated sub-intents directly.
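The abstract does not give the form of the decoder-side attention, so the following is only a minimal sketch of the general idea of weighting two input sources at one decoding step, not the paper's mechanism. It assumes plain dot-product attention over each encoder's output vectors and a softmax gate between the two resulting context vectors; all names (`attend`, `fuse_sources`, `result_states`, `log_states`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, states):
    """Dot-product attention: weighted average of `states` given `query`."""
    weights = softmax([dot(query, s) for s in states])
    dim = len(states[0])
    return [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]


def fuse_sources(decoder_state, result_states, log_states):
    """Combine search-result and query-log contexts for one decoding step.

    Assumption (not from the paper): the source-level weights come from a
    softmax over the decoder state's dot product with each context vector.
    """
    ctx_results = attend(decoder_state, result_states)  # search-result context
    ctx_logs = attend(decoder_state, log_states)        # query-log context
    gate = softmax([dot(decoder_state, ctx_results),
                    dot(decoder_state, ctx_logs)])
    fused = [gate[0] * r + gate[1] * l
             for r, l in zip(ctx_results, ctx_logs)]
    return fused, gate


# Toy vectors standing in for the two encoders' outputs.
decoder_state = [1.0, 0.0]
result_states = [[1.0, 0.0], [0.0, 1.0]]
log_states = [[0.0, 1.0], [0.0, 1.0]]
fused, gate = fuse_sources(decoder_state, result_states, log_states)
print("source weights (results, logs):", gate)
```

In this toy input the decoder state aligns better with the search-result vectors, so the gate puts more weight on that source; a trained model would learn the projections that produce these scores.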
Pages: 180-186
Page count: 7