T-Storm: Traffic-aware Online Scheduling in Storm

Times cited: 137
Authors
Xu, Jielong [1]
Chen, Zhenhua [1]
Tang, Jian [1]
Su, Sen [2]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching, Beijing, Peoples R China
Source
2014 IEEE 34TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2014) | 2014
Funding
National Science Foundation (USA)
Keywords
Big Data; Stream Data Processing; Storm; Scheduling; Resource Management; MapReduce
DOI
10.1109/ICDCS.2014.61
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Storm has emerged as a promising computation platform for stream data processing. In this paper, we first use experimental results and analysis to show the inefficiencies of current Storm scheduling practice and the challenges of applying traffic-aware online scheduling in Storm. Motivated by these observations, we design and implement a new stream data processing system based on Storm, namely T-Storm. Compared to Storm, T-Storm has the following desirable features: 1) based on runtime states, it accelerates data processing by using effective traffic-aware scheduling to assign and re-assign tasks dynamically, which minimizes inter-node and inter-process traffic while ensuring that no worker node is overloaded; 2) it enables fine-grained control over worker node consolidation, so that T-Storm can achieve better performance with even fewer worker nodes; 3) it allows hot-swapping of scheduling algorithms and adjustment of scheduling parameters on the fly; and 4) it is transparent to Storm users (i.e., Storm applications can be ported to run on T-Storm without any changes). For performance evaluation, we conducted experiments on a real cluster using well-known data processing applications. Extensive experimental results show that, compared to Storm with the default scheduler, T-Storm achieves over 84% and 27% speedup (in terms of average processing time) on lightly and heavily loaded topologies, respectively, with 30% fewer worker nodes.
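The abstract does not spell out the scheduling algorithm, so the following is only a minimal sketch of the general idea behind feature 1: place tasks that exchange a lot of traffic on the same worker node, without exceeding a per-node load limit. It is a generic greedy heuristic written for illustration, not T-Storm's actual scheduler. The function names, the `loads`/`traffic` inputs, and the single `capacity` threshold are all assumptions.

```python
from collections import defaultdict


def assign_tasks(loads, traffic, capacity, num_nodes):
    """Hypothetical greedy traffic-aware placement (not the paper's algorithm).

    loads:     dict task -> estimated load of that task
    traffic:   dict (task_a, task_b) -> measured traffic rate between the two
    capacity:  assumed maximum total load allowed on one worker node
    num_nodes: number of available worker nodes
    Returns a dict task -> node index.
    """
    # Build a symmetric view of the pairwise traffic rates.
    neighbors = defaultdict(dict)
    for (a, b), rate in traffic.items():
        neighbors[a][b] = neighbors[a].get(b, 0) + rate
        neighbors[b][a] = neighbors[b].get(a, 0) + rate

    # Place the heaviest communicators first.
    order = sorted(loads, key=lambda t: sum(neighbors[t].values()), reverse=True)

    node_load = [0.0] * num_nodes
    placement = {}
    for task in order:
        best_node, best_local = None, -1.0
        for node in range(num_nodes):
            # Skip nodes that this task would overload.
            if node_load[node] + loads[task] > capacity:
                continue
            # Traffic that stays node-local if the task is placed here.
            local = sum(rate for peer, rate in neighbors[task].items()
                        if placement.get(peer) == node)
            # Strict '>' sends ties to the lowest-numbered node, which
            # tends to consolidate tasks onto fewer nodes.
            if local > best_local:
                best_node, best_local = node, local
        if best_node is None:
            raise ValueError(f"no node has spare capacity for task {task!r}")
        placement[task] = best_node
        node_load[best_node] += loads[task]
    return placement


def inter_node_traffic(placement, traffic):
    """Total traffic between tasks that ended up on different nodes."""
    return sum(rate for (a, b), rate in traffic.items()
               if placement[a] != placement[b])
```

For a four-task chain with unit loads, rates of 100, 80, and 5 between consecutive tasks, and two nodes of capacity 2, this sketch co-locates the first two tasks and the last two, leaving only the 80-unit link crossing nodes. An online scheduler in the spirit of the abstract would rerun such a placement as measured loads and traffic rates change.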
Pages: 535-544
Page count: 10