Hierarchical Sampling from Sketches: Estimating Functions over Data Streams

被引:0
|
作者
Sumit Ganguly
Lakshminath Bhuvanagiri
机构
[1] Indian Institute of Technology,
[2] Google Inc.,undefined
来源
Algorithmica | 2009年 / 53卷
关键词
Data streams; Frequency moments; Entropy; HSS;
D O I
暂无
中图分类号
学科分类号
摘要
We present a randomized procedure named Hierarchical Sampling from Sketches (Hss) that can be used for estimating a class of functions over the frequency vector f of update streams of the form \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\varPsi(\mathcal {S})=\sum_{i=1}^{n}\psi(\vert {f_{i}}\vert )$\end{document} . We illustrate this by applying the Hss technique to design nearly space-optimal algorithms for estimating the pth moment of the frequency vector, for real p≥2 and for estimating the entropy of a data stream.
引用
收藏
页码:549 / 582
页数:33
相关论文
共 50 条
  • [31] Scalable Contrast Pattern Mining over Data Streams
    Alipourchavary, Elaheh
    Erfani, Sarah M.
    Leckie, Christopher
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 2842 - 2846
  • [32] Temporal Coalescing on Window Extents over Data Streams
    Al-Kateb, Mohammed
    Kunta, Sasi Sekhar
    Lee, Byung Suk
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (03): : 489 - 503
  • [33] Multiple ontinuous Queries Evaluation over Data Streams
    Park, Hong Kyu
    Lee, Won Suk
    PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON APPLIED COMPUTER SCIENCE (ACS'08): RECENT ADVANCES ON APPLIED COMPUTER SCIENCE, 2008, : 346 - +
  • [34] Association Rules Mining over Data Streams: Review
    Tan, Jun
    ADVANCES IN CIVIL ENGINEERING II, PTS 1-4, 2013, 256-259 : 2890 - 2893
  • [35] Universal and Accurate Sketch for Estimating Heavy Hitters and Moments in Data Streams
    Xiao, Qingjun
    Cai, Xuyuan
    Qin, Yifei
    Tang, Zhiying
    Chen, Shigang
    Liu, Yu
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (05) : 1919 - 1934
  • [36] Efficient Window Processing over Disordered Data Streams
    Kim, Hyeon-Gyu
    Kang, Woo-Lam
    Kim, Myoung-Ho
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (03): : 635 - 638
  • [37] Estimating Functions of Distributions Defined over Spaces of Unknown Size
    Wolpert, David H.
    DeDeo, Simon
    ENTROPY, 2013, 15 (11): : 4668 - 4699
  • [38] Incrementally Mining Recently Repeating Patterns over Data Streams
    Koh, Jia-Ling
    Chou, Pei-Min
    NEW FRONTIERS IN APPLIED DATA MINING, 2009, 5433 : 26 - 37
  • [39] A Fast and Efficient Algorithm for Outlier Detection Over Data Streams
    Hassaan, Mosab
    Maher, Hend
    Gouda, Karam
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (11) : 749 - 756
  • [40] Clustering Heterogeneous Data Streams with Uncertainty over Sliding Window
    Hentech, Houda
    Gouider, Mohammed Salah
    Farhat, Amine
    MODEL AND DATA ENGINEERING, MEDI 2013, 2013, 8216 : 162 - 175