Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors

被引:32
作者
Zhang, Shuhao [1 ,2 ]
He, Bingsheng [2 ]
Dahlmeier, Daniel [1 ]
Zhou, Amelie Chi [3 ]
Heinze, Thomas [4 ]
机构
[1] SAP Innovat Ctr Singapore, Singapore, Singapore
[2] Natl Univ Singapore, Singapore, Singapore
[3] INRIA, Rennes, France
[4] SAP SE Walldorf, Walldorf, Germany
来源
2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017) | 2017年
基金
新加坡国家研究基金会;
关键词
D O I
10.1109/ICDE.2017.119
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Driven by the rapidly increasing demand for handling real-time data streams, many data stream processing (DSP) systems have been proposed. Regardless of the different architectures of those DSP systems, they are mostly aiming at scaling out using a cluster of commodity machines and built around a number of key design aspects: a) pipelined processing with message passing, b) on-demand data parallelism, and c) JVM based implementation. However, there lacks a study on those key design aspects on modern scale-up architectures, where more CPU cores are being put on the same die, and the on-chip cache hierarchies are getting larger, deeper, and complex. Multiple sockets bring non-uniform memory access (NUMA) effort. In this paper, we revisit the aforementioned design aspects on a modern scale-up server. Specifically, we use a series of applications as micro benchmark to conduct detailed profiling studies on Apache Storm and Flink. From the profiling results, we observe two major performance issues: a) the massively parallel execution model causes serious front-end stalls, which are a major performance bottleneck issue on a single CPU socket, b) the lack of NUMA-aware mechanism causes major drawback on the scalability of DSP systems on multi-socket architectures. Addressing these issues should allow DSP systems to exploit modern scale-up architectures, which also benefits scaling out environments. We present our initial efforts on resolving the above-mentioned performance issues, which have shown up to 3.2x and 3.1x improvement on the performance of Storm and Flink, respectively.
引用
收藏
页码:659 / 670
页数:12
相关论文
共 33 条
[1]   Aurora: a new model and architecture for data stream management [J].
Abadi, DJ ;
Carney, D ;
Cetintemel, U ;
Cherniack, M ;
Convey, C ;
Lee, S ;
Stonebraker, M ;
Tatbul, N ;
Zdonik, S .
VLDB JOURNAL, 2003, 12 (02) :120-139
[2]  
Ailamaki A., 2009, VLDB
[3]  
Aniello L., 2013, DEBS
[4]  
[Anonymous], 2013, SOSP
[5]  
[Anonymous], 2013, ICDE
[6]  
[Anonymous], MIDDLEWARE
[7]  
[Anonymous], 2009, WWW
[8]  
[Anonymous], 2005, CIDR
[9]  
Arasu Arvind., 2004, VLDB
[10]  
Awan A.J., 2015, BDCloud