H-Tree: An Efficient Index Structure for Event Matching in Content-Based Publish/Subscribe Systems

被引:24
作者
Qian, Shiyou [1 ]
Cao, Jian [1 ]
Zhu, Yanmin [1 ]
Li, Minglu [1 ]
Wang, Jie [2 ]
机构
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Stanford Univ, Dept Civil & Environm Engn, Palo Alto, CA 94304 USA
关键词
Content-based publish/subscribe; event matching; index structure; performance; matching time;
D O I
10.1109/TPDS.2014.2323262
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Content-based publish/subscribe systems have been employed to deal with complex distributed information flows in many applications. It is well recognized that event matching is a fundamental component of such large-scale systems. Event matching searches a space which is composed of all subscriptions. As the scale and complexity of a system grows, the efficiency of event matching becomes more critical to system performance. However, most existing methods suffer significant performance degradation when the system has large numbers of both subscriptions and their component constraints. In this paper, we present Hash Tree (H-Tree), a highly efficient index structure for event matching. H-Tree is a hash table in nature that is a combination of hash lists and hash chaining. A hash list is built up on an indexed attribute by realizing novel overlapping divisions of the attribute's value domain, providing more efficient space consumption. Multiple hash lists are then combined into a hash tree. The basic idea behind H-Tree is that matching efficiencies are improved when the search space is substantially reduced by pruning most of the subscriptions that are not matched. We have implemented H-Tree and conducted extensive experiments in different settings. Experimental results demonstrate that H-Tree has better performance than its counterparts by a large margin. In particular, the matching speed is faster by three orders of magnitude than its counterparts when the numbers of both subscriptions and their component constraints are huge.
引用
收藏
页码:1622 / 1632
页数:11
相关论文
共 32 条
  • [1] Aguilera M. K., 1999, Proceedings of the Eighteenth Annual ACM Symposium on Principles of Distributed Computing, P53, DOI 10.1145/301308.301326
  • [2] [Anonymous], 2009, P VLDB ENDOWMENT
  • [3] [Anonymous], 2010, TIBCO RENDEZVOUS 8 3
  • [4] [Anonymous], 1925, Statistical methods for research workers
  • [5] [Anonymous], 2012, IBM WEBSPHERE MQ 7 5
  • [6] Predicate matching and subscription matching in publish/subscribe systems
    Ashayer, G
    Leung, HKY
    Jacobsen, HA
    [J]. 22ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOP, PROCEEDINGS, 2002, : 539 - 546
  • [7] Breslau L, 1999, IEEE INFOCOM SER, P126, DOI 10.1109/INFCOM.1999.749260
  • [8] Efficient filtering in publish-subscribe systems using binary decision diagrams
    Campailla, A
    Chaki, S
    Clarke, E
    Jha, S
    Veith, H
    [J]. PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, 2001, : 443 - 452
  • [9] Carzaniga A, 2003, ACM SIGCOMM COMP COM, V33, P163
  • [10] Design and evaluation of a wide-area event notification service
    Carzaniga, A
    Rosenblum, DS
    Wolf, AL
    [J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2001, 19 (03): : 332 - 383