Stream cube: An architecture for multi-dimensional analysis of data streams

Cited by: 75
Authors
Han, JW [1]
Chen, YX
Dong, GZ
Pei, J
Wah, BW
Wang, JY
Cai, YD
Affiliations
[1] Univ Illinois, Chicago, IL 60680 USA
[2] Washington Univ, St Louis, MO 63130 USA
[3] Wright State Univ, Dayton, OH 45435 USA
[4] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
[5] Tsinghua Univ, Beijing 100084, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
29;
DOI
10.1007/s10619-005-3296-1
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate a tremendous (potentially infinite) volume of stream data: the volume is too large to be scanned multiple times. Much of this data resides at a rather low level of abstraction, whereas most analysts are interested in relatively high-level dynamic changes (such as trends and outliers). To discover such high-level characteristics, one may need to perform on-line, multi-level, multi-dimensional analytical processing of stream data. In this paper, we propose an architecture, called stream-cube, to facilitate on-line, multi-dimensional, multi-level analysis of stream data. Three techniques are proposed for efficient and effective computation of stream cubes. First, a tilted time frame model is proposed as a multi-resolution model to register time-related data: the more recent data are registered at finer resolution, whereas the more distant data are registered at coarser resolution. This design reduces the overall storage of time-related data and adapts nicely to the data analysis tasks commonly encountered in practice. Second, instead of materializing cuboids at all levels, we propose to maintain a small number of critical layers. Flexible analysis can be efficiently performed based on the concepts of an observation layer and a minimal interesting layer. Third, an efficient stream data cubing algorithm is developed which computes only the layers (cuboids) along a popular path and leaves the other cuboids for query-driven, on-line computation. Based on this design methodology, a stream data cube can be constructed and maintained incrementally with a reasonable amount of memory, computation cost, and query response time. This is verified by our substantial performance study. The stream data cube architecture facilitates online analytical processing of stream data. It also forms a preliminary data structure for online stream data mining. The impact of the design and implementation of the stream data cube in the context of stream data mining is also discussed in the paper.
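The tilted time frame idea in the abstract (recent data at fine resolution, older data at coarse resolution) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the class name `TiltedTimeFrame`, the quarter/hour/day granularities, the slot capacities, and the use of a sum as the aggregate are all assumptions chosen for illustration. When a fine level fills up, its slots are merged into a single slot of the next coarser level, and the coarsest level discards its oldest slot once it is over capacity.

```python
from collections import deque


class TiltedTimeFrame:
    """Illustrative multi-resolution register (hypothetical, not the paper's code)."""

    def __init__(self, levels):
        # levels: (name, capacity) pairs ordered from finest to coarsest.
        self.names = [name for name, _ in levels]
        self.caps = [cap for _, cap in levels]
        self.slots = [deque() for _ in levels]

    def add(self, value):
        """Register the aggregate of one completed finest-granularity unit."""
        self._push(0, value)

    def _push(self, level, value):
        slots = self.slots[level]
        slots.append(value)
        is_coarsest = level == len(self.slots) - 1
        if is_coarsest:
            # The coarsest level simply forgets its oldest slot when over capacity.
            if len(slots) > self.caps[level]:
                slots.popleft()
        elif len(slots) == self.caps[level]:
            # A full finer level rolls up into one slot of the next coarser level.
            merged = sum(slots)
            slots.clear()
            self._push(level + 1, merged)

    def snapshot(self):
        """Return the retained slots per level, oldest first within each level."""
        return {name: list(s) for name, s in zip(self.names, self.slots)}

    def total(self):
        """Sum of everything still retained across all levels."""
        return sum(sum(s) for s in self.slots)


# Assumed granularities: 4 quarters per hour, 24 hours per day, keep 31 days.
frame = TiltedTimeFrame([("quarter", 4), ("hour", 24), ("day", 31)])
for _ in range(10):  # ten quarter-hour units, one event counted in each
    frame.add(1)
print(frame.snapshot())
```

Under these assumed capacities the register never holds more than 60 slots, whereas keeping every quarter-hour for 31 days would need 4 × 24 × 31 = 2976 slots; this is the kind of storage reduction the abstract attributes to the tilted time frame.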
Pages: 173-197
Number of pages: 25
Related papers
50 in total
  • [1] Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams
    Jiawei Han
    Yixin Chen
    Guozhu Dong
    Jian Pei
    Benjamin W. Wah
    Jianyong Wang
    Y. Dora Cai
    Distributed and Parallel Databases, 2005, 18 : 173 - 197
  • [2] An Interactive Interface for Multi-Dimensional Data Stream Analysis
    Marques, Nuno C.
    Santos, Hugo
    Silva, Bruno
    Proceedings 2016 20th International Conference Information Visualisation IV 2016, 2016, : 223 - 229
  • [3] A Multi-Dimensional Analysis and Data Cube for Unstructured Text and Social Media
    Lee, Suan
    Kim, Namsoo
    Kim, Jinho
    2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON BIG DATA AND CLOUD COMPUTING (BDCLOUD), 2014, : 761 - 764
  • [4] In Pursuit of Outliers in Multi-dimensional Data Streams
    Sadik, Shiblee
    Gruenwald, Le
    Leal, Eleazar
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 512 - 521
  • [5] BBoxDB streams: scalable processing of multi-dimensional data streams
    Nidzwetzki, Jan Kristof
    Gueting, Ralf Hartmut
    DISTRIBUTED AND PARALLEL DATABASES, 2022, 40 (2-3) : 559 - 625
  • [6] BBoxDB streams: scalable processing of multi-dimensional data streams
    Jan Kristof Nidzwetzki
    Ralf Hartmut Güting
    Distributed and Parallel Databases, 2022, 40 : 559 - 625
  • [7] Online multi-dimensional regression analysis on concept-drifting data streams
    Nadungodage, Chandima Hewa
    Xia, Yuni
    Vaidya, Pranav S.
    Chen, Yu
    Lee, Jaehwan John
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2014, 6 (03) : 217 - 238
  • [8] Multi-dimensional uncertain data stream clustering algorithm
    Luo, Qinghua
    Peng, Yu
    Peng, Xiyuan
    Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2013, 34 (06) : 1330 - 1338
  • [9] Forecasting the Data Cube: A Model Configuration Advisor for Multi-Dimensional Data Sets
    Fischer, Ulrike
    Schildt, Christopher
    Hartmann, Claudio
    Lehner, Wolfgang
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 853 - 864
  • [10] Mining multi-dimensional frequent patterns without data cube construction
    Li, Chuan
    Tang, Changjie
    Yu, Zhonghua
    Liu, Yintian
    Zhang, Tianqing
    Liu, Qihong
    Zhu, Mingfang
    Jiang, Yongguang
    PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 251 - 260