An efficient framework for parallel and continuous frequent item monitoring

被引:18
作者
Zhang, Yu [1 ]
Sun, Yue [2 ]
Zhang, Jianzhong [1 ]
Xu, Jingdong [1 ]
Wu, Ying [1 ]
机构
[1] Nankai Univ, Coll Comp & Control Engn, Tianjin 300071, Peoples R China
[2] Tianjin Acad Agr Sci, Rice Res Inst, Tianjin 300071, Peoples R China
基金
中国国家自然科学基金;
关键词
frequent items; parallel algorithms; weighted data streams; multi-core processors; FINDING FREQUENT; ELEMENTS; STREAMS;
D O I
10.1002/cpe.3182
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
In high-speed network monitoring, the ever-growing traffic calls for a high-performance solution for the computation of frequent items. The increasing number of cores in the current commodity multi-core processors opens up new opportunities in parallelization. In this paper, we present a novel precision integrated framework (PRIF) that exploits the great parallel capability of multi-cores to speed up the famous frequent algorithm. PRIF equally distributes the input data stream into sub-threads that use the optimized weighted frequent algorithm to track local frequent items. The items with frequency increments exceeding a pre-defined threshold are sent to a merging thread which is able to return the global continuous epsilon-deficient frequent items. The theoretical correctness and complexity analyses are presented. Experiments with real and synthetic traces confirm the theoretical analyses and demonstrate the excellent performance as well as the effects of parameters and data skewness. The results show that PRIF is able to provide continuous frequent items and near-linear speedup at the cost of greater memory use. Copyright (c) 2013 John Wiley & Sons, Ltd.
引用
收藏
页码:2856 / 2879
页数:24
相关论文
共 37 条
[1]  
Akhter S., 2006, MULTICORE PROGRAMMIN, V1st
[2]  
[Anonymous], 2009, PVLDB
[3]  
[Anonymous], 2007, 2007 ACM SIGMOD INT
[4]   Space-optimal Heavy Hitters with Strong Error Bounds [J].
Berinde, Radu ;
Indyk, Piotr ;
Cormode, Graham ;
Strauss, Martin J. .
PODS'09: PROCEEDINGS OF THE TWENTY-EIGHTH ACM SIGMOD-SIGACT-SIGART SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2009, :157-166
[5]  
Bose Prosenjit., 2003, SIROCCO, P33
[6]  
Cafaro M, CRM3322
[7]   Finding frequent items in parallel [J].
Cafaro, Massimo ;
Tempesta, Piergiulio .
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (15) :1774-1788
[8]   Flow identification for supporting per-flow queueing [J].
Cao, ZR ;
Wang, Z .
NINTH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS, PROCEEDINGS, 2000, :88-93
[9]   Finding frequent items in data streams [J].
Charikar, M ;
Chen, K ;
Farach-Colton, M .
THEORETICAL COMPUTER SCIENCE, 2004, 312 (01) :3-15
[10]   An improved data stream summary: the count-min sketch and its applications [J].
Cormode, G ;
Muthukrishnan, S .
JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2005, 55 (01) :58-75