Hierarchical Sampling from Sketches: Estimating Functions over Data Streams

被引:0
|
作者
Sumit Ganguly
Lakshminath Bhuvanagiri
机构
[1] Indian Institute of Technology,
[2] Google Inc.,undefined
来源
Algorithmica | 2009年 / 53卷
关键词
Data streams; Frequency moments; Entropy; HSS;
D O I
暂无
中图分类号
学科分类号
摘要
We present a randomized procedure named Hierarchical Sampling from Sketches (Hss) that can be used for estimating a class of functions over the frequency vector f of update streams of the form \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\varPsi(\mathcal {S})=\sum_{i=1}^{n}\psi(\vert {f_{i}}\vert )$\end{document} . We illustrate this by applying the Hss technique to design nearly space-optimal algorithms for estimating the pth moment of the frequency vector, for real p≥2 and for estimating the entropy of a data stream.
引用
收藏
页码:549 / 582
页数:33
相关论文
共 50 条
  • [1] Hierarchical Sampling from Sketches: Estimating Functions over Data Streams
    Ganguly, Sumit
    Bhuvanagiri, Lakshminath
    ALGORITHMICA, 2009, 53 (04) : 549 - 582
  • [2] Random sampling algorithms for sliding windows over data streams
    Zhang, LB
    Li, ZH
    Yu, M
    Wang, Y
    Jiang, Y
    PROCEEDINGS OF THE 11TH JOINT INTERNATIONAL COMPUTER CONFERENCE, 2005, : 572 - 575
  • [3] Adaptive stratified reservoir sampling over heterogeneous data streams
    Al-Kateb, Mohammed
    Lee, Byung Suk
    INFORMATION SYSTEMS, 2014, 39 : 199 - 216
  • [4] A performance evaluation of data streams sampling algorithms over a sliding window
    El Sibai, Rayane
    Chabchoub, Yousra
    Demerjian, Jacques
    Chiky, Raja
    Barbar, Kablan
    2018 IEEE MIDDLE EAST AND NORTH AFRICA COMMUNICATIONS CONFERENCE (MENACOMM), 2018, : 211 - 216
  • [5] Estimating hybrid frequency moments of data streams
    Sumit Ganguly
    Mohit Bansal
    Shruti Dube
    Journal of Combinatorial Optimization, 2012, 23 : 373 - 394
  • [6] A New Method for Estimating the Number of Distinct Values over Data streams
    Guo, Longjiang
    Li, Yingshu
    Ren, Meirui
    Zhang, Zhongzhao
    SNPD 2009: 10TH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCES, NETWORKING AND PARALLEL DISTRIBUTED COMPUTING, PROCEEDINGS, 2009, : 71 - +
  • [7] Estimating hybrid frequency moments of data streams
    Ganguly, Sumit
    Bansal, Mohit
    Dube, Shruti
    JOURNAL OF COMBINATORIAL OPTIMIZATION, 2012, 23 (03) : 373 - 394
  • [8] Computing Hierarchical Summary of the Data Streams
    Shah, Zubair
    Mahmood, Abdun Naser
    Barlow, Michael
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT II, 2016, 9652 : 168 - 179
  • [9] Estimating Multilevel Models on Data Streams
    Ippel, L.
    Kaptein, M. C.
    Vermunt, J. K.
    PSYCHOMETRIKA, 2019, 84 (01) : 41 - 64
  • [10] Estimating Multilevel Models on Data Streams
    L. Ippel
    M. C. Kaptein
    J. K. Vermunt
    Psychometrika, 2019, 84 : 41 - 64