Improving Scheduling Efficiency of Hadoop YARN Using AFSA Algorithm

Cited by: 1
Authors
Gao Junlei [1]
Tang Tie [1]
Wu Gangshan [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
Source
2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017) | 2017
Keywords
YARN; Scheduler; AFSA; Hadoop
DOI
10.1109/ISPA/IUCC.2017.00141
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Apache Hadoop is one of the most popular MapReduce frameworks for parallel processing of large data sets. As the job scheduler and resource manager, YARN plays a very important role. Schedulers on YARN are designed to minimize the makespan of MapReduce jobs. The performance of a scheduler in YARN depends not only on whether the resource capacity of the working nodes is fully utilized, but also on the dependencies among tasks. Therefore, it is very difficult to achieve an optimal solution. This paper proposes a new Hadoop YARN scheduling algorithm. The algorithm formalizes the problem as a multiple knapsack problem that takes into consideration the resource cost and time cost of each task as well as the dependencies between tasks. The Artificial Fish Swarm Algorithm (AFSA) is adopted to solve the knapsack optimization problem. The algorithm was implemented as a pluggable scheduler on the most recent version of Hadoop YARN and evaluated with several MapReduce benchmarks. The experimental results show that our scheduler could effectively reduce the makespan of Hadoop jobs by 30% compared with some existing scheduling policies.
Pages: 919-924
Number of pages: 6