Adaptive clusters and histograms over data streams

Cited by: 0
Authors
Puttagunta, V [1]
Kalpakis, K [1]
Affiliation
[1] Univ Maryland Baltimore Cty, Dept Comp Sci & Elect Engn, Baltimore, MD 21250 USA
Source
IKE '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE ENGINEERING | 2005
Keywords
non-stationary streams; adaptive clusters; forgetting factors; histograms; query processing
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Incremental clustering and histograms over data streams have wide applications. When a stream is non-stationary, these summaries must be adaptive as well as incremental; by adaptive, we mean that they reflect the properties of data from the recent past. We discuss approaches to adaptive stream computations and advocate the use of forgetting factors, where each data item is associated with a weight that decays with time. We present a weighted k-means clustering algorithm that uses forgetting factors to perform adaptive clustering over data streams. The main advantages of this algorithm are its simplicity and user-friendliness: it allows users to dynamically change the number of clusters as well as the decay rates of different clusters, depending on their interestingness. Further, we show that adaptive multidimensional histograms can be maintained over real-valued data streams using adaptive clusters, by treating each cluster as a bucket of the histogram. We observe that the clusters (as well as the histograms) adapt well to changes in the data. Using weighted-count range queries, we demonstrate the effectiveness of our adaptive histograms over non-stationary streams.
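The general idea of forgetting factors can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details this record does not give. It assumes an exponential decay exp(-rate * elapsed time), one-pass assignment of each point to its nearest centroid, seeding from the first k points, and a coarse range query that counts a bucket only when its centroid lies inside the query box. The names `DecayedCluster`, `AdaptiveClusters`, `decay_rate`, and `weighted_count` are hypothetical.

```python
import math


class DecayedCluster:
    """A cluster summary (histogram bucket): a centroid plus a decaying weight."""

    def __init__(self, point, decay_rate, now):
        self.centroid = list(point)
        self.weight = 1.0
        self.decay_rate = decay_rate  # assumed per-cluster forgetting rate
        self.last_update = now

    def decay_to(self, now):
        # Assumed exponential forgetting; timestamps must be non-decreasing.
        self.weight *= math.exp(-self.decay_rate * (now - self.last_update))
        self.last_update = now

    def absorb(self, point):
        # Weighted mean of the decayed old mass and the new unit-weight point.
        new_weight = self.weight + 1.0
        self.centroid = [
            (self.weight * c + x) / new_weight
            for c, x in zip(self.centroid, point)
        ]
        self.weight = new_weight


class AdaptiveClusters:
    """One-pass nearest-centroid clustering with decayed cluster weights."""

    def __init__(self, k, decay_rate):
        self.k = k
        self.decay_rate = decay_rate
        self.clusters = []

    def update(self, point, now):
        for cluster in self.clusters:
            cluster.decay_to(now)
        if len(self.clusters) < self.k:
            # Simplification: seed clusters from the first k points.
            self.clusters.append(DecayedCluster(point, self.decay_rate, now))
            return
        nearest = min(
            self.clusters, key=lambda c: math.dist(c.centroid, point)
        )
        nearest.absorb(point)

    def weighted_count(self, low, high, now):
        """Coarse estimate of the decayed weight inside the box [low, high]."""
        total = 0.0
        for cluster in self.clusters:
            cluster.decay_to(now)
            inside = all(
                lo <= x <= hi
                for lo, x, hi in zip(low, cluster.centroid, high)
            )
            if inside:
                total += cluster.weight
        return total


model = AdaptiveClusters(k=2, decay_rate=1.0)
model.update([0.0], now=0.0)
model.update([10.0], now=1.0)
model.update([10.0], now=2.0)
recent = model.weighted_count([9.0], [11.0], now=2.0)
stale = model.weighted_count([-1.0], [1.0], now=2.0)
```

In this toy run the bucket near 10 carries more weight than the bucket near 0, because the older point has decayed. Keeping the centroid separate from the weight means that decay changes only how much a bucket counts, not where it sits, until a new point arrives.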
Pages: 98-104
Page count: 7