Residual Traffic Based Task Scheduling in Hadoop

Cited by: 0
Authors
Tanaka, Daichi [1]
Kawarasaki, Masatoshi [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Lib Informat & Media Studies, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Fac Lib Informat & Media Sci, Tsukuba, Ibaraki, Japan
Source
CLOUD COMPUTING 2015: THE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, GRIDS, AND VIRTUALIZATION | 2015
Keywords
distributed computing; Hadoop; MapReduce; job performance; network simulation
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In Hadoop job processing, large data transfers are reported to significantly affect job performance. In this paper, we show that performance deterioration in a CPU (Central Processing Unit) heterogeneous environment is caused by delay in the copy phase, which results from heavy load on the inter-rack links of the cluster network. We therefore propose a new scheduling method, Residual Traffic Based Task Scheduling, that estimates the amount of inter-rack data transfer in the copy phase and regulates task assignment accordingly. We evaluate the method using ns-3 (network simulator 3) and show that it can significantly improve Hadoop job performance.
Pages: 94-102
Number of pages: 9