IDP: An Innovative Data Placement Algorithm for Hadoop Systems

Cited by: 2
Authors
Lee, Chia-Wei [1]
Huang, Horng-Chyau [1]
Hsieh, Sun-Yuan [1, 2, 3]
Affiliations
[1] Natl Cheng Kung Univ, Inst Med Informat, 1 Univ Rd, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
[3] Natl Cheng Kung Univ, Inst Mfg Informat & Syst, Tainan 701, Taiwan
Source
INTELLIGENT SYSTEMS AND APPLICATIONS (ICS 2014) | 2015 / Vol. 274
Keywords
Data Placement; Hadoop; Heterogeneous; MapReduce
DOI
10.3233/978-1-61499-484-8-49
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a data placement strategy that addresses the imbalanced workload problem on DataNodes. Based on the computing capability of each node in a heterogeneous Hadoop cluster, the proposed strategy balances the data stored on each DataNode so that data transfer time is substantially reduced, which in turn improves overall Hadoop performance. Experimental results demonstrate that the proposed data placement strategy considerably decreases execution time and thus improves Hadoop performance in a heterogeneous cluster.
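The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of capability-proportional placement, not the IDP algorithm itself. It assumes each node has a single positive capability score, and it uses a largest-remainder rounding rule so that block counts sum exactly to the total. The function name `allocate_blocks`, the node names, and the scores are all hypothetical.

```python
from typing import Dict


def allocate_blocks(total_blocks: int, capability: Dict[str, float]) -> Dict[str, int]:
    """Split total_blocks across nodes in proportion to their capability scores.

    Assumption: capability values are positive numbers (e.g. a measured
    relative processing rate). This is a generic proportional split, not
    the method from the paper.
    """
    if total_blocks < 0:
        raise ValueError("total_blocks must be non-negative")
    total_capability = sum(capability.values())
    if total_capability <= 0:
        raise ValueError("capability scores must sum to a positive value")

    # Ideal (fractional) share of blocks for each node.
    exact = {node: total_blocks * score / total_capability
             for node, score in capability.items()}

    # Start from the rounded-down share for every node.
    alloc = {node: int(share) for node, share in exact.items()}

    # Hand the leftover blocks to the nodes with the largest fractional parts.
    leftover = total_blocks - sum(alloc.values())
    by_remainder = sorted(capability, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical cluster: node "a" is three times as capable as node "c".
    print(allocate_blocks(12, {"a": 3.0, "b": 2.0, "c": 1.0}))
```

Under these assumptions a faster node receives proportionally more blocks, so more of its map tasks can read local data instead of fetching blocks from slower nodes.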
Pages: 49-58
Page count: 10