SALSA: Self-Adjusting Lean Streaming Analytics

被引:16
作者
Ben Basat, Ran [1 ]
Einziger, Gil [2 ]
Mitzenmacher, Michael [3 ]
Vargaftik, Shay [4 ]
机构
[1] UCL, London, England
[2] Ben Gurion Univ Negev, Beer Sheva, Israel
[3] Harvard Univ, Cambridge, MA 02138 USA
[4] VMware Res, Seattle, WA USA
来源
2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021) | 2021年
关键词
SKETCH;
D O I
10.1109/ICDE51399.2021.00080
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Counters are the fundamental building block of many data sketching schemes, which hash items to a small number of counters and account for collisions to provide good approximations for frequencies and other measures. Most existing methods rely on fixed-size counters, which may be wasteful in terms of space, as counters must be large enough to eliminate any risk of overflow. Instead, some solutions use small, fixed-size counters that may overflow into secondary structures. This paper takes a different approach. We propose a simple and general method called SALSA for dynamic re-sizing of counters, and show its effectiveness. SALSA starts with small counters, and overflowing counters simply merge with their neighbors. SALSA can thereby allow more counters for a given space, expanding them as necessary to represent large numbers. Our evaluation demonstrates that, at the cost of a small overhead for its merging logic, SALSA significantly improves the accuracy of popular schemes (such as Count-Min Sketch and Count Sketch) over a variety of tasks. Our code is released as open source [1].
引用
收藏
页码:864 / 875
页数:12
相关论文
共 46 条
[1]  
[Anonymous], 1908, BIOMETRIKA, V6, P1
[2]  
[Anonymous], 2018, The CAIDA equinix-newyork packet trace
[3]  
[Anonymous], 2018, ACM CONEXT
[4]  
[Anonymous], 2016, CAIDA EQUINIX CHICAG
[5]  
[Anonymous], 2018, TRENDING YOUTUBE VID
[6]  
[Anonymous], 2020, SALSA OPEN SOURCE CO
[7]  
[Anonymous], 2002, SPECTS
[8]  
[Anonymous], CAFFEINE HIGH PERFOR
[9]  
Basat R. B., 2020, IEEE INFOCOM
[10]  
Basat R. B., 2018, PERVASIVE MOB COMPUT