Data Sketching

被引:43
作者
Cormode, Graham [1 ,2 ,3 ]
机构
[1] Univ Warwick, Comp Sci, Coventry, W Midlands, England
[2] Bell Labs, 600 Mt Ave, Murray Hill, NJ 07974 USA
[3] AT&T Algorithms Data Management, Bedminster, NJ USA
基金
英国工程与自然科学研究理事会;
关键词
D O I
10.1145/3080008
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
This article introduces the ideas behind sketching, with a focus on algorithmic innovations. It describes some algorithmic developments in the abstract, followed by the steps needed to put them into practice, with examples. It also looks at four novel algorithmic ideas and discusses some emerging areas. The possibility of false positives needs to be handled carefully. Bloom filters are at their most attractive when the consequence of a false positive is not the introduction of an error in a computation, but rather when it causes some additional work that does not adversely impact the overall performance of the system.
引用
收藏
页码:48 / 55
页数:8
相关论文
共 16 条
  • [1] Ahn K.J., 2012, P ACM SIAM S DISCR A
  • [2] [Anonymous], 2005, Scientific Programming
  • [3] [Anonymous], P INT C MACH LEARN
  • [4] [Anonymous], 2003, Internet Mathematics, DOI DOI 10.1080/15427951.2004.10129096
  • [5] Bhagat S., 2016, 3 AND A HALF DEGR SE
  • [6] SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS
    BLOOM, BH
    [J]. COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 422 - &
  • [7] Clarkson KL, 2013, STOC'13: PROCEEDINGS OF THE 2013 ACM SYMPOSIUM ON THEORY OF COMPUTING, P81
  • [8] An improved data stream summary: the count-min sketch and its applications
    Cormode, G
    Muthukrishnan, S
    [J]. JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2005, 55 (01): : 58 - 75
  • [9] Cormode G., 2004, P 2004 ACM SIGMOD IN, P35
  • [10] Flajolet P., 1985, J COMPUTER SYSTEM SC, p[76, 182]