Modeling concepts and their relationships for corpus-based query auto-completion

Cited by: 1
Authors
Rossiello, Gaetano [1]
Caputo, Annalina [2]
Basile, Pierpaolo [3]
Semeraro, Giovanni [3]
Affiliations
[1] Thomas J Watson Res Ctr, IBM Res AI, Yorktown Hts, NY 10598 USA
[2] Trinity Coll Dublin, ADAPT Ctr, Sch Comp Sci & Stat, Dublin, Ireland
[3] Univ Bari Aldo Moro, Dept Comp Sci, Bari, Italy
Funding
Science Foundation Ireland; EU Horizon 2020
Keywords
query auto-completion; information retrieval; information extraction; probabilistic graphical model;
DOI
10.1515/comp-2019-0015
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Query auto-completion helps users formulate their information needs by providing a list of suggestions at every keystroke. The task is commonly addressed by exploiting query logs, and the approaches proposed in the literature fit web-scale scenarios well, where huge amounts of past user queries can usually be analyzed to provide reliable suggestions. However, when query logs are not available, e.g. in enterprise or desktop search engines, these methods are not applicable at all. To address these challenging scenarios, we present a novel corpus-based approach that exploits the textual content of an indexed document collection to generate query completions dynamically. Our method extracts informative text fragments from the corpus and combines them using a probabilistic graphical model in order to capture the relationships between the extracted concepts. With this approach, partial queries can be completed automatically with meaningful suggestions related to the keywords already entered by the user, without requiring any analysis of past queries. We evaluate our system through a user study on two different real-world document collections. The experiments show that our method provides meaningful completions and outperforms the state-of-the-art approach.
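To make the idea of completing a partial query from corpus text alone more concrete, the following is a minimal sketch in Python. It is not the authors' method: word n-grams stand in for the extracted "informative text fragments", and a plain document co-occurrence count stands in for the probabilistic graphical model. The function names (`build_index`, `complete`) and the parameters `max_n` and `k` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs, max_n=2):
    """Index word n-grams (a crude stand-in for extracted concepts).

    Returns the corpus frequency of each n-gram and the set of
    document ids in which each n-gram occurs.
    """
    freq = Counter()
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        tokens = tokenize(text)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = " ".join(tokens[i:i + n])
                freq[gram] += 1
                postings[gram].add(doc_id)
    return freq, postings


def complete(partial_query, freq, postings, k=3):
    """Suggest up to k completions for the last, partially typed term.

    Assumption: candidates are ranked by how many documents they share
    with all previously typed words, with corpus frequency as a
    tie-break. No query log is consulted.
    """
    *context, prefix = partial_query.lower().split(" ")

    # Documents that contain every already-entered keyword.
    context_docs = None
    for word in context:
        docs_with_word = postings.get(word, set())
        context_docs = (docs_with_word if context_docs is None
                        else context_docs & docs_with_word)

    scored = []
    for gram, gram_docs in postings.items():
        if not gram.startswith(prefix):
            continue
        if context_docs is None:
            overlap = len(gram_docs)
        else:
            overlap = len(gram_docs & context_docs)
            if overlap == 0:
                continue  # unrelated to the words typed so far
        scored.append((-overlap, -freq[gram], gram))

    scored.sort()
    return [" ".join(context + [gram]) for _, _, gram in scored[:k]]


if __name__ == "__main__":
    sample_docs = [
        "query auto completion uses query logs",
        "enterprise search engines index documents",
        "enterprise search lacks query logs",
    ]
    freq, postings = build_index(sample_docs)
    print(complete("enterprise se", freq, postings))
```

Swapping the co-occurrence count for a learned model over concept relationships would be the natural next step toward the kind of approach the abstract describes; the sketch only illustrates that suggestions can be conditioned on earlier keywords using the indexed collection itself.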
Pages: 212-225
Number of pages: 14