TopX: efficient and versatile top-k query processing for semistructured data

Authors
Martin Theobald
Holger Bast
Debapriyo Majumdar
Ralf Schenkel
Gerhard Weikum
Affiliation
[1] Max-Planck Institute for Informatics,
Source
The VLDB Journal | 2008 / Vol. 17
Keywords
Efficient XML full-text search; Content- and structure-aware ranking; Top-k query processing; Cost-based index access scheduling; Probabilistic candidate pruning; Dynamic query expansion; DB&IR integration
Abstract
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dynamic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: (1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, (2) efficient and effective top-k query processing for semistructured data, (3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and query expansion, and (4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.
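The early-termination idea in the abstract can be illustrated with a small sketch. The code below is not the TopX algorithm; it is a generic threshold-style top-k scan in the spirit of Fagin's Threshold Algorithm, assuming summation as the monotonic aggregation function, non-negative scores, and a score of 0 for items missing from a list. The function name `threshold_top_k`, the in-memory list layout, and the sample scores are all hypothetical and chosen only for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Posting = Tuple[str, float]  # (item id, per-condition score)


def threshold_top_k(index_lists: List[List[Posting]], k: int) -> Tuple[List[Posting], int]:
    """Return the k best items by summed score and the number of scan rounds used.

    Assumptions (illustrative only): each list is sorted by descending score,
    scores are non-negative, and an item absent from a list scores 0 there.
    """
    # Dictionaries stand in for random-access lookups into each index list.
    lookup: List[Dict[str, float]] = [dict(lst) for lst in index_lists]
    top: List[Tuple[float, str]] = []  # min-heap holding the current k best
    seen = set()
    max_depth = max((len(lst) for lst in index_lists), default=0)
    rounds = 0

    while rounds < max_depth:
        # Upper bound on the total score of any item not yet seen.
        threshold = 0.0
        for lst in index_lists:
            if rounds >= len(lst):
                continue  # exhausted list: unseen items score 0 here
            item, score = lst[rounds]
            threshold += score
            if item in seen:
                continue
            seen.add(item)
            total = sum(table.get(item, 0.0) for table in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, item))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, item))
        rounds += 1
        # Safe to stop: no unseen item can beat the current k-th best score.
        if len(top) == k and top[0][0] >= threshold:
            break

    ranked = sorted(((item, total) for total, item in top), key=lambda p: (-p[1], p[0]))
    return ranked, rounds


if __name__ == "__main__":
    content_scores = [("d1", 0.875), ("d2", 0.5), ("d3", 0.25), ("d4", 0.125)]
    structure_scores = [("d1", 0.75), ("d3", 0.375), ("d2", 0.25), ("d4", 0.125)]
    print(threshold_top_k([content_scores, structure_scores], k=2))
```

In this toy input the scan stops before reaching the last entry of either list, because the sum of the scores at the current scan positions has already dropped below the second-best aggregated score. Monotonicity of the aggregation function is what makes that bound valid.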
Pages: 81-115
Number of pages: 34