Continuous Top-k Monitoring on Document Streams

被引:9
|
作者
Hou, Leong U. [1 ]
Zhang, Junjie [1 ]
Mouratidis, Kyriakos [2 ]
Li, Ye [1 ]
机构
[1] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
[2] Singapore Management Univ, Sch Informat Syst, Singapore 188065, Singapore
关键词
Top-k query; continuous query; document stream; QUERIES;
D O I
10.1109/TKDE.2017.2657622
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the most relevant content to their preferences. In this work, user preferences are indicated by a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. Our objective is to support large numbers of users and high stream rates, while refreshing the top-k results almost instantaneously. Our solution abandons the traditional frequency-ordered indexing approach. Instead, it follows an identifier-ordering paradigm that suits better the nature of the problem. When complemented with a novel, locally adaptive technique, our method offers (i) proven optimality w.r.t. the number of considered queries per stream event, and (ii) an order of magnitude shorter response time (i.e., time to refresh the query results) than the current state-of-the-art.
引用
收藏
页码:991 / 1003
页数:13
相关论文
共 50 条
  • [31] A SURVEY ON TOP-K QUERY PROCESSING IN MANETs
    Mohanapriya, T.
    Ranganathan, S. Raja
    Karthik, S.
    PROCEEDINGS OF 2017 11TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO 2017), 2017, : 480 - 484
  • [32] Finding top-k longest palindromes in substrings
    Mitani, Kazuki
    Mieno, Takuya
    Seto, Kazuhisa
    Horiyama, Takashi
    THEORETICAL COMPUTER SCIENCE, 2023, 979
  • [33] Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams
    Wang, Ling
    Qu, Zhao Yang
    Zhou, Tie Hua
    Ryu, Keun Ho
    INTELLIGENT COMPUTING THEORIES, 2013, 7995 : 305 - 314
  • [34] Generic Techniques for Building Top-k Structures
    Rahul, Saladi
    Tao, Yufei
    ACM TRANSACTIONS ON ALGORITHMS, 2022, 18 (04)
  • [35] Efficient top-k recently-frequent term querying over spatio-temporal textual streams
    Dam, Thu-Lan
    Chester, Sean
    Norvag, Kjetil
    Duong, Quang-Huy
    INFORMATION SYSTEMS, 2021, 97
  • [36] Optimal algorithms for selecting top-k combinations of attributes: theory and applications
    Lin, Chunbin
    Lu, Jiaheng
    Wei, Zhewei
    Wang, Jianguo
    Xiao, Xiaokui
    VLDB JOURNAL, 2018, 27 (01): : 27 - 52
  • [37] Indexing and progressive top-k similarity retrieval of trajectories
    Nikolaos Pliakis
    Eleftherios Tiakas
    Yannis Manolopoulos
    World Wide Web, 2021, 24 : 51 - 83
  • [38] Top-k Learned Clauses for Modern SAT Solvers
    Lonlac, Jerry
    Nguifo, Engelbert Mephu
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2023, 32 (01)
  • [39] Semantics and evaluation of top-k queries in probabilistic databases
    Zhang, Xi
    Chomicki, Jan
    DISTRIBUTED AND PARALLEL DATABASES, 2009, 26 (01) : 67 - 126
  • [40] Scalable top-k keyword search in relational databases
    Yanwei Xu
    Cluster Computing, 2019, 22 : 731 - 747