Fast correlation coefficient estimation algorithm for HBase-based massive time series data

Cited by: 1
Authors
Liu, Wen [1, 2]
Zhang, Tuqian [2]
Shen, Yanming [2]
Wang, Peng [3]
Affiliations
[1] Xinjiang Inst Engn, Dept Elect & Informat Engn, Urumqi 830091, Peoples R China
[2] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Peoples R China
[3] Fudan Univ, Sch Comp Sci, Shanghai 201203, Peoples R China
Keywords
time series; HBase; correlation coefficient; fast estimation
DOI
10.1007/s11704-018-6308-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, the rapid development of the Internet of Things and sensor networks has led to explosive growth in time series data. OpenTSDB and other emerging systems have begun to use Hadoop and HBase to store massive time series data, and how to query and mine time series data on these platforms has become an active research topic. As a typical distance measure for time series, the correlation coefficient is widely used in various applications. However, computing the correlation coefficient of long time series on HBase in real time requires a large amount of I/O and network transmission, and therefore cannot support interactive queries. To address this problem, we present two methods for estimating the correlation coefficient of two sequences on HBase. We first propose DCE, a fast estimation algorithm for the upper and lower bounds of the correlation coefficient. To further reduce the I/O cost, we extend DCE and propose the ADCE algorithm, which estimates the correlation coefficient quickly in an iterative manner. Experiments show that the proposed algorithms can quickly calculate the correlation coefficient of long time series.
Pages: 864-878
Page count: 15
Related papers
50 in total
  • [31] Fast Motion Estimation Algorithm Based on Real Time Monitoring
    Xu, Xuemei
    Mo, Qin
    Ni, Lan
    Guo, Qiaoyun
    Li, An
    MANUFACTURING SCIENCE AND TECHNOLOGY, PTS 1-8, 2012, 383-390 : 5028 - 5033
  • [32] An algorithm for time series data mining based on clustering
    Wu, Shaozhi
    Wu, Yue
    Wang, Ying
    Ye, Yalan
    2006 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS PROCEEDINGS, VOLS 1-4: VOL 1: SIGNAL PROCESSING, 2006, : 2155 - +
  • [33] Research on massive AIS ship data storage method based on HBase and Elasticsearch
    Ma, Huadong
    Wen, Yubo
    Zhou, Haiying
    Wang, Lei
    2024 5TH INTERNATIONAL CONFERENCE ON GEOLOGY, MAPPING AND REMOTE SENSING, ICGMRS 2024, 2024, : 260 - 263
  • [34] A novel two-dimensional correlation coefficient for assessing associations in time series data
    Dikbas, Fatih
    INTERNATIONAL JOURNAL OF CLIMATOLOGY, 2017, 37 (11) : 4065 - 4076
  • [35] Diagnostic checks in time series models based on a new correlation coefficient of residuals
    Pei, Jian
    Zhu, Fukang
    Li, Qi
    JOURNAL OF APPLIED STATISTICS, 2024, 51 (12) : 2402 - 2419
  • [36] R2Time: a framework to analyse OpenTSDB time-series data in HBase
    Agrawal, Bikash
    Chakravorty, Antorweep
    Rong, Chunming
    Wlodarczyk, Tomasz Wiktor
    2014 IEEE 6TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM), 2014, : 970 - 975
  • [37] Fast image matching based on correlation coefficient
    Department of Optical Engineering, School of Information Science and Technology, Beijing Institute of Technology, Beijing 100081, China
    Beijing Ligong Daxue Xuebao, 2007, 11 (998-1000):
  • [38] A new tendency correlation coefficient for bivariate time series
    Jian Zhou
    Zhongsheng Hua
    Rendiconti Lincei. Scienze Fisiche e Naturali, 2021, 32 : 479 - 491
  • [39] COEFFICIENT OF DIRECTIONAL CORRELATION FOR TIME-SERIES ANALYSES
    STRAHAN, RF
    PSYCHOLOGICAL BULLETIN, 1971, 76 (03) : 211 - &
  • [40] Wavelet correlation coefficient of 'strongly correlated' time series
    Razdan, A
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2004, 333 : 335 - 342