Hierarchical Sampling from Sketches: Estimating Functions over Data Streams

被引:0
|
作者
Sumit Ganguly
Lakshminath Bhuvanagiri
机构
[1] Indian Institute of Technology,
[2] Google Inc.,undefined
来源
Algorithmica | 2009年 / 53卷
关键词
Data streams; Frequency moments; Entropy; HSS;
D O I
暂无
中图分类号
学科分类号
摘要
We present a randomized procedure named Hierarchical Sampling from Sketches (Hss) that can be used for estimating a class of functions over the frequency vector f of update streams of the form \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\varPsi(\mathcal {S})=\sum_{i=1}^{n}\psi(\vert {f_{i}}\vert )$\end{document} . We illustrate this by applying the Hss technique to design nearly space-optimal algorithms for estimating the pth moment of the frequency vector, for real p≥2 and for estimating the entropy of a data stream.
引用
收藏
页码:549 / 582
页数:33
相关论文
共 50 条
  • [21] Efficient sampling of non-strict turnstile data streams
    Barkay, Neta
    Porat, Ely
    Shalem, Bar
    THEORETICAL COMPUTER SCIENCE, 2015, 590 : 106 - 117
  • [22] Finding frequent itemsets over online data streams
    Chang, Joong Hyuk
    Lee, Won Suk
    INFORMATION AND SOFTWARE TECHNOLOGY, 2006, 48 (07) : 606 - 618
  • [23] Discovering Frequent Tree Patterns over Data Streams
    Hsieh, Mark Cheng-Enn
    Wu, Yi-Hung
    Chen, Arbee L. P.
    PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 629 - +
  • [24] Sketching asynchronous data streams over sliding windows
    Bojian Xu
    Srikanta Tirthapura
    Costas Busch
    Distributed Computing, 2008, 20 : 359 - 374
  • [25] Clustering Data Streams over Sliding Windows by DCA
    Ta Minh Thuy
    Le Thi Hoai An
    Boudjeloud-Assala, Lydia
    ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, 2013, 479 : 65 - 75
  • [26] Mining frequent closed itemsets from a landmark window over online data streams
    Liu, Xuejun
    Guan, Jihong
    Hu, Ping
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2009, 57 (06) : 927 - 936
  • [27] Concept Shift Detection for Frequent Itemsets from Sliding Windows over Data Streams
    Koh, Jia-Ling
    Lin, Ching-Yi
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2009, 5667 : 334 - 348
  • [28] Interactive discovery of association rules over data streams
    Shin, Se Jung
    Lee, Won Suk
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2014, 29 (05): : 341 - 352
  • [29] Efficient Predicate Matching over Continuous Data Streams
    Kim, Hyeon-Gyu
    Kang, Woo-Lam
    Lee, Yoon-Joon
    Kim, Myoung-Ho
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (09): : 1787 - 1790
  • [30] Sketching asynchronous data streams over sliding windows
    Xu, Bojian
    Tirthapura, Srikanta
    Busch, Costas
    DISTRIBUTED COMPUTING, 2008, 20 (05) : 359 - 374