Timon: A Timestamped Event Database for Efficient Telemetry Data Processing and Analytics

Times cited: 13
Authors
Cao, Wei [1, 2]
Gao, Yusong [2 ]
Li, Feifei [2 ]
Wang, Sheng [2 ]
Lin, Bingchen [2 ]
Xu, Ke [2 ]
Feng, Xiaojie [2 ]
Wang, Yucong [2 ]
Liu, Zhenjun [2 ]
Zhang, Gejin [2 ]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Source
SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2020
Keywords
time series database; cloud computing; data processing system; real-time data analytics; out-of-order events; MAPREDUCE; LATENCY;
DOI
10.1145/3318464.3386136
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the increasing demand for real-time system monitoring and tracking in various contexts, the amount of time-stamped event data grows at an astonishing rate. Analytics on time-stamped events must run in real time, and the aggregated results must remain accurate even when data arrives out of order. Unfortunately, frequent out-of-order data significantly slows processing and causes large delays in query responses. Timon is a timestamped event database that aims to support aggregations and handle late arrivals both correctly (i.e., upholding exactly-once semantics) and efficiently. Our insight is that a broad range of applications can be implemented with data structures and corresponding operators that satisfy associative and commutative properties. Records arriving after the low watermark are appended to Timon directly, allowing aggregations to be performed lazily. To improve query efficiency, Timon maintains a TS-LSM-Tree, which keeps the most recent data in memory and contains a time-partitioning tree on disk for high-volume data accumulated over a long time span. In addition, Timon supports materialized aggregation views and correlation analysis across multiple streams. Timon has been successfully deployed at Alibaba Cloud and is a critical building block for Alibaba Cloud's continuous monitoring and anomaly analysis infrastructure.
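The abstract's central idea, that associative and commutative operators let late records be appended first and aggregated lazily, can be illustrated with a minimal sketch. This is not Timon's implementation or API: the names `Agg` and `LazyWindowStore`, the fixed-size windowing, and the in-memory dictionaries are all assumptions made for illustration, and the sketch does not model the TS-LSM-Tree, watermarks, or durability.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Agg:
    """Hypothetical partial aggregate: count, sum, min, max.

    merge() is associative and commutative, so partial results can be
    combined in any order and any grouping with the same outcome.
    """
    count: int = 0
    total: int = 0
    lo: float = float("inf")
    hi: float = float("-inf")

    @staticmethod
    def of(value):
        return Agg(1, value, value, value)

    def merge(self, other):
        return Agg(
            self.count + other.count,
            self.total + other.total,
            min(self.lo, other.lo),
            max(self.hi, other.hi),
        )


class LazyWindowStore:
    """Hypothetical store: appends are cheap, aggregation happens at query time."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.cached = {}                  # window start -> merged Agg
        self.pending = defaultdict(list)  # window start -> values not yet merged

    def append(self, ts, value):
        # A late record is simply appended; no aggregate is recomputed here.
        start = ts - ts % self.window_size
        self.pending[start].append(value)

    def query(self, start):
        # Fold pending values into the cached aggregate. pop() removes them,
        # so each appended value is merged exactly once in this sketch.
        agg = self.cached.get(start, Agg())
        for value in self.pending.pop(start, []):
            agg = agg.merge(Agg.of(value))
        self.cached[start] = agg
        return agg


# Events as (timestamp, value); arrival order differs between the two stores.
events = [(1, 5), (12, 7), (3, 2), (15, 1), (8, 9)]
in_order = LazyWindowStore(window_size=10)
reordered = LazyWindowStore(window_size=10)
for ts, value in events:
    in_order.append(ts, value)
for ts, value in reversed(events):
    reordered.append(ts, value)
```

Because `merge` is order-independent, both stores return the same aggregate for a window regardless of arrival order, and a record appended after a query is folded in on the next query without recomputing the window from scratch. Integer values are used here because floating-point addition is not strictly associative.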
Pages: 739-753
Page count: 15