Space-code bloom filter for efficient per-flow traffic measurement

Cited by: 78
Authors
Kumar, Abhishek [1]
Xu, Jun (Jim)
Wang, Jia
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
[2] AT&T Labs Res, Dept Network Measurement & Engn Res, Internet & Networking Syst Res Ctr, Florham Pk, NJ 07932 USA
Funding
National Science Foundation (USA)
Keywords
Bloom filter (BF); data structures; network measurement; statistical inference; traffic analysis
DOI
10.1109/JSAC.2006.884032
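Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract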
Per-flow traffic measurement is critical for usage accounting, traffic engineering, and anomaly detection. Previous methodologies are either based on random sampling (e.g., Cisco's NetFlow), which is inaccurate, or only account for the "elephants." We introduce a novel technique for measuring per-flow traffic approximately, for all flows regardless of their sizes, at very high link speeds (e.g., OC-768). The core of this technique is a novel data structure called Space-Code Bloom Filter (SCBF). An SCBF is an approximate representation of a multiset; each element in this multiset is a traffic flow, and its multiplicity is the number of packets in the flow. The multiplicity of an element in the multiset represented by an SCBF can be estimated through either of two mechanisms: maximum-likelihood estimation or mean value estimation. Through parameter tuning, SCBF allows for a graceful tradeoff between measurement accuracy and computational and storage complexity. SCBF also contributes to the foundation of data streaming by introducing a new paradigm called blind streaming. We evaluate the performance of SCBF through mathematical analysis and through experiments on packet traces gathered from a tier-1 ISP backbone. Our results demonstrate that SCBF achieves reasonable measurement accuracy with very low storage and computational complexity. We also apply SCBF to estimating the frequency of keywords at a search engine, which shows its applicability to other problems that can be reduced to multiset membership queries.
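The multiset idea in the abstract can be illustrated with a minimal Python sketch. It is a simplified reading of the technique, not the authors' implementation: it assumes the filter holds several groups of hash functions over one shared bit array, that each packet sets the bits of one randomly chosen group, and that a query counts how many groups fully match. The class name, the parameter defaults, the BLAKE2-based hashing, and the simple inversion used for the mean value estimate (which ignores false positives) are all assumptions made for illustration. The maximum-likelihood estimator is not shown.

```python
import hashlib
import math
import random


class SpaceCodeBloomFilter:
    """Illustrative sketch: l groups of k hash functions over one bit array."""

    def __init__(self, num_bits=1 << 16, num_groups=32, hashes_per_group=3, seed=0):
        self.m = num_bits
        self.l = num_groups
        self.k = hashes_per_group
        self.bits = bytearray(num_bits)   # one byte per bit, for readability
        self.rng = random.Random(seed)    # seeded so runs are repeatable

    def _positions(self, flow_id, group):
        # Derive the k bit positions of one group from salted digests
        # (hypothetical hashing scheme, chosen only for this sketch).
        for i in range(self.k):
            salted = f"{group}:{i}:{flow_id}".encode()
            digest = hashlib.blake2b(salted, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def insert(self, flow_id):
        # Per packet: choose ONE group uniformly at random and set its k bits.
        # No per-flow state is read or kept.
        group = self.rng.randrange(self.l)
        for pos in self._positions(flow_id, group):
            self.bits[pos] = 1

    def matched_groups(self, flow_id):
        # Number of groups whose k bits are all set for this flow.
        return sum(
            all(self.bits[pos] for pos in self._positions(flow_id, g))
            for g in range(self.l)
        )

    def estimate(self, flow_id):
        # Mean value estimate: after x insertions the expected number of
        # distinct groups hit is l * (1 - (1 - 1/l)**x); solve for x.
        theta = self.matched_groups(flow_id)
        if theta == 0:
            return 0.0
        if theta >= self.l:
            return math.inf  # saturated: every group matched
        return math.log(1 - theta / self.l) / math.log(1 - 1 / self.l)


scbf = SpaceCodeBloomFilter()
for _ in range(10):                 # a flow of 10 packets
    scbf.insert("10.0.0.1->10.0.0.2")
print(scbf.matched_groups("10.0.0.1->10.0.0.2"), scbf.estimate("10.0.0.1->10.0.0.2"))
```

With a single filter of this kind the estimate saturates once every group has been hit, so the number of groups limits the range of flow sizes it can resolve in this sketch.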
Pages: 2327-2339
Page count: 13