TopX: efficient and versatile top-k query processing for semistructured data

Cited by: 0
Authors
Martin Theobald
Holger Bast
Debapriyo Majumdar
Ralf Schenkel
Gerhard Weikum
Affiliations
[1] Max-Planck Institute for Informatics,
Source
The VLDB Journal | 2008 / Vol. 17
Keywords
Efficient XML full-text search; Content- and structure-aware ranking; Top-k query processing; Cost-based index access scheduling; Probabilistic candidate pruning; Dynamic query expansion; DB&IR integration
DOI
Not available
Abstract
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dynamic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: (1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, (2) efficient and effective top-k query processing for semistructured data, (3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and query expansion, and (4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.
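The early-termination idea in the abstract (stop as soon as the k top-ranked results are safely known under a monotonic score aggregation) can be illustrated with a generic threshold-style top-k loop. The sketch below is not the TopX algorithm itself: it assumes summation as the aggregation function, one in-memory score table per query dimension, and free random access to every table. The function name `threshold_topk` and the data layout are hypothetical and chosen only for illustration.

```python
import heapq


def threshold_topk(lists, k):
    """Return up to k (total_score, item) pairs, best first.

    lists: one dict per query dimension, mapping item id -> non-negative score.
    Aggregation is a plain sum, which is monotonic (an assumption of this sketch).
    """
    if k <= 0:
        return []
    # Sorted access: each dimension is scanned in descending score order.
    sorted_lists = [sorted(d.items(), key=lambda kv: -kv[1]) for d in lists]
    max_depth = max((len(s) for s in sorted_lists), default=0)
    seen = set()
    top = []  # min-heap of (total_score, item) holding at most k entries
    for depth in range(max_depth):
        last_scores = []
        for s in sorted_lists:
            if depth >= len(s):
                # Exhausted list: every item in it has been seen already,
                # so an unseen item contributes 0 in this dimension.
                last_scores.append(0)
                continue
            item, score = s[depth]
            last_scores.append(score)
            if item in seen:
                continue
            seen.add(item)
            # Random access: look up the item's score in every dimension.
            total = sum(d.get(item, 0) for d in lists)
            if len(top) < k:
                heapq.heappush(top, (total, item))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, item))
        # No unseen item can score higher than the sum of the scores
        # at the current scan positions.
        threshold = sum(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break  # the current top-k can no longer change
    return sorted(top, reverse=True)
```

The loop stops once the k-th best total is at least the threshold, which is the point at which the current top-k is safe. The abstract's keywords (cost-based index access scheduling, probabilistic candidate pruning) suggest TopX goes well beyond this baseline, and none of that is modeled here.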
Pages: 81-115
Page count: 34