H-Tree: An Efficient Index Structure for Event Matching in Content-Based Publish/Subscribe Systems

被引:24
作者
Qian, Shiyou [1 ]
Cao, Jian [1 ]
Zhu, Yanmin [1 ]
Li, Minglu [1 ]
Wang, Jie [2 ]
机构
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Stanford Univ, Dept Civil & Environm Engn, Palo Alto, CA 94304 USA
关键词
Content-based publish/subscribe; event matching; index structure; performance; matching time;
D O I
10.1109/TPDS.2014.2323262
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Content-based publish/subscribe systems have been employed to deal with complex distributed information flows in many applications. It is well recognized that event matching is a fundamental component of such large-scale systems. Event matching searches a space which is composed of all subscriptions. As the scale and complexity of a system grows, the efficiency of event matching becomes more critical to system performance. However, most existing methods suffer significant performance degradation when the system has large numbers of both subscriptions and their component constraints. In this paper, we present Hash Tree (H-Tree), a highly efficient index structure for event matching. H-Tree is a hash table in nature that is a combination of hash lists and hash chaining. A hash list is built up on an indexed attribute by realizing novel overlapping divisions of the attribute's value domain, providing more efficient space consumption. Multiple hash lists are then combined into a hash tree. The basic idea behind H-Tree is that matching efficiencies are improved when the search space is substantially reduced by pruning most of the subscriptions that are not matched. We have implemented H-Tree and conducted extensive experiments in different settings. Experimental results demonstrate that H-Tree has better performance than its counterparts by a large margin. In particular, the matching speed is faster by three orders of magnitude than its counterparts when the numbers of both subscriptions and their component constraints are huge.
引用
收藏
页码:1622 / 1632
页数:11
相关论文
共 32 条
  • [11] Load Balancing Content-Based Publish/Subscribe Systems
    Cheung, Alex King Yeung
    Jacobsen, Hans-Arno
    [J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2010, 28 (04):
  • [12] The JEDI event-based infrastructure and its application to the development of the OPSSWFMS
    Cugola, G
    Di Nitto, E
    Fuggetta, A
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2001, 27 (09) : 827 - 850
  • [13] The many faces of publish/subscribe
    Eugster, PT
    Felber, PA
    Guerraoui, R
    Kermarrec, AM
    [J]. ACM COMPUTING SURVEYS, 2003, 35 (02) : 114 - 131
  • [14] Fabret F, 2001, SIGMOD RECORD, V30, P115, DOI 10.1145/376284.375677
  • [15] Fabret F., 2000, P COOPLS, P1
  • [16] Fontoura Marcus., 2010, P 2010 INT C MANAGEM, P3
  • [17] Efficient event matching in publish/subscribe: Based on routing destination and matching history
    Guo, Xiangfeng
    Wei, Jun
    Han, Dongli
    [J]. PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE, AND STORAGE, 2008, : 129 - +
  • [18] Jafarpour H., 2009, P 3 ACM INT C DISTR, P7
  • [19] Split and Subsume Subscription Normalization for Effective Content-based Messaging
    Jayaram, K. R.
    Eugster, Patrick
    [J]. 31ST INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2011), 2011, : 824 - 835
  • [20] Jerzak Z., 2008, Proc. of DEBS '08, P71, DOI DOI 10.1145/1385989.1385999