TopX: efficient and versatile top-k query processing for semistructured data

Authors
Martin Theobald
Holger Bast
Debapriyo Majumdar
Ralf Schenkel
Gerhard Weikum
Affiliation
[1] Max-Planck Institute for Informatics
Source
The VLDB Journal | 2008 / Volume 17
Keywords
Efficient XML full-text search; Content- and structure-aware ranking; Top-k query processing; Cost-based index access scheduling; Probabilistic candidate pruning; Dynamic query expansion; DB&IR integration
Abstract
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dynamic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: (1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, (2) efficient and effective top-k query processing for semistructured data, (3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and query expansion, and (4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.
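The early-termination idea in the abstract (stop as soon as the k best results are provably fixed under a monotonic aggregation function) can be illustrated with a minimal sketch of the classic threshold algorithm over score-sorted index lists. This is a generic textbook technique and not TopX's own algorithm; the function name `threshold_topk`, the input layout (one list of `(doc_id, score)` pairs per query dimension, sorted by descending score), and the choice of summation as the aggregation function are assumptions made for illustration only.

```python
import heapq


def threshold_topk(index_lists, k):
    """Return (results, depth) for a top-k query over score-sorted lists.

    index_lists: one list per query dimension, each holding (doc_id, score)
    pairs sorted by descending score (hypothetical layout).
    Aggregation is a plain sum, which is monotonic.
    """
    # Random-access tables: doc_id -> score, one per index list.
    lookups = [dict(lst) for lst in index_lists]
    seen = set()
    top = []  # min-heap of (total_score, doc_id), at most k entries
    depth = 0
    max_depth = max((len(lst) for lst in index_lists), default=0)

    while depth < max_depth:
        last_scores = []
        for lst in index_lists:
            if depth >= len(lst):
                last_scores.append(0.0)  # exhausted list adds nothing more
                continue
            doc_id, score = lst[depth]  # sorted access
            last_scores.append(score)
            if doc_id in seen:
                continue
            seen.add(doc_id)
            # Random access: fetch this document's score in every list.
            total = sum(lk.get(doc_id, 0.0) for lk in lookups)
            if len(top) < k:
                heapq.heappush(top, (total, doc_id))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, doc_id))
        depth += 1
        # No unseen document can score above the sum of the last-read scores.
        threshold = sum(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break  # the current top-k is safe; stop scanning early

    results = [(doc, total) for total, doc in sorted(top, key=lambda t: (-t[0], t[1]))]
    return results, depth
```

Because the aggregation is monotonic, the sum of the scores last read under sorted access bounds the score of every document not yet seen, so the scan can stop well before the lists are exhausted. The returned `depth` shows how many positions per list were actually read.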
Pages: 81 - 115
Page count: 34
Related papers
50 in total
  • [1] TopX: efficient and versatile top-k query processing for semistructured data
    Theobald, Martin
    Bast, Holger
    Majumdar, Debapriyo
    Schenkel, Ralf
    Weikum, Gerhard
    VLDB JOURNAL, 2008, 17 (01) : 81 - 115
  • [2] Efficient and Secure Top-k Query Processing on Hybrid Sensed Data
    Wu, Haiqin
    Wang, Liangmin
    MOBILE INFORMATION SYSTEMS, 2016, 2016
  • [3] TKEP: An efficient top-k query processing algorithm on massive data
    Han X.-X.
    Yang D.-H.
    Li J.-Z.
    Jisuanji Xuebao/Chinese Journal of Computers, 2010, 33 (08) : 1405 - 1417
  • [4] Efficient Distributed Top-k Query Processing with Caching
    Ryeng, Norvald H.
    Vlachou, Akrivi
    Doulkeridis, Christos
    Norvag, Kjetil
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT II, 2011, 6588 : 280 - 295
  • [5] Effective and efficient top-k query processing over incomplete data streams
    Ren, Weilong
    Lian, Xiang
    Ghazinour, Kambiz
    INFORMATION SCIENCES, 2021, 544 : 343 - 371
  • [6] Efficient top-k query evaluation on probabilistic data
    Re, Christopher
    Dalvi, Nilesh
    Suciu, Dan
    2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 861 - +
  • [7] Best position algorithms for efficient top-k query processing
    Akbarinia, Reza
    Pacitti, Esther
    Valduriez, Patrick
    INFORMATION SYSTEMS, 2011, 36 (06) : 973 - 989
  • [8] Efficient Group Top-k Spatial Keyword Query Processing
    Yao, Kai
    Li, Jianjun
    Li, Guohui
    Luo, Changyin
    WEB TECHNOLOGIES AND APPLICATIONS, PT I, 2016, 9931 : 153 - 165
  • [9] Efficient Top-K Query Processing on Massively Parallel Hardware
    Shanbhag, Anil
    Pirk, Holger
    Madden, Samuel
    SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018, : 1557 - 1570
  • [10] Crowdsourcing for Top-K Query Processing over Uncertain Data
    Ciceri, Eleonora
    Fraternali, Piero
    Martinenghi, Davide
    Tagliasacchi, Marco
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (01) : 41 - 53