Causal Prediction of Top-k Event Types Over Real-Time Event Streams

被引:2
|
作者
Acharya, Saurav [1 ]
Lee, Byung Suk [1 ]
Hines, Paul [2 ]
机构
[1] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
[2] Univ Vermont, Sch Engn, Burlington, VT USA
来源
COMPUTER JOURNAL | 2017年 / 60卷 / 11期
基金
美国国家科学基金会;
关键词
prediction; top-k query; causal network; event stream; BAYESIAN-NETWORK STRUCTURES; CLICKSTREAM DATA; INDEPENDENCE; INFORMATION; INFERENCE; ALGORITHM;
D O I
10.1093/comjnl/bxw098
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
This paper addresses the problem of causally predicting the top-k most likely next events over real-time event streams. Existing approaches have limitations-(i) they model causality in an acyclic causal network structure and search it to find the top-k next events, which does not work with real world event streams as they frequently manifest cyclic causality, and (ii) they prune out possible non-causal links from a causal network too aggressively and end up omitting many less frequent yet important causal links. We overcome these limitations using a novel event precedence model (EPM) and a run-time causal inference mechanism. The EPM constructs a Markov chain incrementally over event streams, where an edge between two events signifies a temporal precedence relationship between them, which is a necessary condition for causality. Then, the run-time causal inference mechanism performs causality tests on the EPM during query processing, and temporal precedence relationships that fail the causality test in the presence of other events are removed. Two query processing algorithms are presented. One performs exhaustive search on the model and the other performs more efficient reduced search with early termination. Experiments using two real data sets (cascading blackouts in power systems and web page views) verify efficacy and efficiency of the proposed probabilistic top-k prediction algorithms.
引用
收藏
页码:1561 / 1581
页数:21
相关论文
共 41 条
  • [1] Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams
    Wang, Ling
    Qu, Zhao Yang
    Zhou, Tie Hua
    Ryu, Keun Ho
    INTELLIGENT COMPUTING THEORIES, 2013, 7995 : 305 - 314
  • [2] A REAL-TIME TOP-K QUERY ALGORITHM AND PARAL LELIZED IMPLEMENTATION
    Lu, Yao
    Liu, Jun
    2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014, : 160 - 164
  • [3] Evaluating continuous top-k queries over document streams
    Weixiong Rao
    Lei Chen
    Shudong Chen
    Sasu Tarkoma
    World Wide Web, 2014, 17 : 59 - 83
  • [4] Evaluating continuous top-k queries over document streams
    Rao, Weixiong
    Chen, Lei
    Chen, Shudong
    Tarkoma, Sasu
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2014, 17 (01): : 59 - 83
  • [5] A tree-based approach for event prediction using episode rules over event streams
    Cho, Chung-Wen
    Zheng, Ying
    Wu, Yi-Hung
    Chen, Arbee L. P.
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2008, 5181 : 225 - +
  • [6] Enhanced Fast Causal Network Inference over Event Streams
    Acharya, Saurav
    Lee, Byung Suk
    TRANSACTIONS ON LARGE-SCALE DATA- AND KNOWLEDGE- CENTERED SYSTEMS XVII, 2015, 8970 : 45 - 73
  • [7] Sliding Window Top-K Monitoring over Distributed Data Streams
    Lv, Zhijin
    Chen, Ben
    Yu, Xiaohui
    WEB AND BIG DATA, APWEB-WAIM 2017, PT I, 2017, 10366 : 527 - 540
  • [8] Mining top-k high utility patterns over data streams
    Zihayat, Morteza
    An, Aijun
    INFORMATION SCIENCES, 2014, 285 : 138 - 161
  • [9] Real-time prediction of clinical trial enrollment and event counts: A review
    Heitjan, Daniel F.
    Ge, Zhiyun
    Ying, Gui-shuang
    CONTEMPORARY CLINICAL TRIALS, 2015, 45 : 26 - 33
  • [10] Mining top-k frequent patterns over data streams sliding window
    Chen, Hui
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2014, 42 (01) : 111 - 131