Mnemonic Lossy Counting: An Efficient and Accurate Heavy-hitters Identification Algorithm

被引:4
作者
Rong, Qiong [1 ]
Zhang, Guangxing [1 ]
Xie, Gaogang [1 ]
Salamatian, Kave [2 ]
机构
[1] Chinese Acad Sci, Network Technol Res Ctr, Inst Comp Technol, POB 2704, Beijing 100190, Peoples R China
[2] Univ Savoie, Polytech Annecy Chambery, LISTIC, F-74944 Annecy Le Vieux, France
来源
2010 IEEE 29TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC) | 2010年
关键词
traffic measurements; heavy-hitters; historical information; mnemonic; ELEMENTS; FREQUENT;
D O I
10.1109/PCCC.2010.5682303
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Identifying heavy-hitter traffic flows efficiently and accurately is essential for Internet security, accounting and traffic engineering. However, finding all heavy-hitters might require large memory for storage of flows information that is incompatible with the usage of fast and small memory. Moreover, upcoming 100Gbps transmission rates make this recognition more challenging. How to improve the accuracy of heavy-hitters identification with limited memory space has become a critical issue. This paper presents a scalable algorithm named Mnemonic Lossy Counting (MLC) that improves the accuracy of heavy-hitters identification while having a reasonable time and space complexity. MLC algorithm holds potential candidate heavy-hitters in a historical information table. This table is used to obtain tighter error bounds on the estimated sizes of candidate heavy-hitters. We validate the MLC algorithm using real network traffic traces, and we compared its performance with two state-of-the-art algorithms, namely Lossy Counting (LC) and Probabilistic Lossy Counting (PLC). The results reveal that: 1) with same set of parameters and memory usage, MLC achieves between 31.5% and 6.67% fewer false positives than LC and PLC. 2) MLC and LC have a zero false negative ratio, whereas 38% of the cases PLC has a non-zero false negatives and PLC can miss up to 4.4% of heavy-hitters. 3) MLC has a slightly lower memory cost than LC during the first few windows and its memory usage decreases with time, when PLC memory usage declines sharply. 4) MLC has similar runtime than LC, and smaller time than PLC.
引用
收藏
页码:255 / 262
页数:8
相关论文
共 24 条
  • [1] ADAMIC L. A., 2002, Glottometrics, V3, P143, DOI DOI 10.1109/S0SE.2014.50
  • [2] [Anonymous], 2002, VERY LARGE DATA BASE
  • [3] [Anonymous], 25 IEEE INT C COMP C
  • [4] BARAKAT C, 2005, P CONEXT
  • [5] How to scalably and accurately skip past streams
    Bhattacharyya, Supratik
    Madeira, Andre
    Muthukrishnan, S.
    Ye, Tao
    [J]. 2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, VOLS 1-2, 2007, : 654 - 663
  • [6] CHARIKAR M, 2002, P INT C AT LANG PROG
  • [7] An improved data stream summary: the count-min sketch and its applications
    Cormode, G
    Muthukrishnan, S
    [J]. JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2005, 55 (01): : 58 - 75
  • [8] Demaine ED, 2002, LECT NOTES COMPUT SC, V2461, P348
  • [9] Dimitropoulos X, 2008, ACM SIGCOMM COMP COM, V38, P7, DOI 10.1145/1341431.1341433
  • [10] DUFFIELD N, 2004, SIGMETRICS 04 PERFOR, P85