H2Hadoop: Improving Hadoop Performance Using the Metadata of Related Jobs

Cited by: 5
Authors
Alshammari, Hamoud [1]
Lee, Jeongkyu [2]
Bajwa, Hassan [3]
Affiliations
[1] Univ Bridgeport, Dept Comp Sci & Engn, Bridgeport, CT 06604 USA
[2] Univ Bridgeport, Dept Comp Sci, Bridgeport, CT 06604 USA
[3] Univ Bridgeport, Dept Elect Engn, Bridgeport, CT 06604 USA
Keywords
BigData; Cloud Computing; Hadoop; H2Hadoop; Hadoop Performance; MapReduce; Text Data; MAPREDUCE PERFORMANCE; BIG DATA; OPTIMIZATION;
DOI
10.1109/TCC.2016.2535261
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud Computing leverages the Hadoop framework to process BigData in parallel. Hadoop has certain limitations that, if addressed, would allow jobs to execute more efficiently. These limitations stem mostly from data locality in the cluster, job and task scheduling, and resource allocation in Hadoop. Efficient resource allocation remains a challenge in Cloud Computing MapReduce platforms. We propose H2Hadoop, an enhanced Hadoop architecture that reduces the computation cost associated with BigData analysis. The proposed architecture also addresses the issue of resource allocation in native Hadoop. H2Hadoop provides a better solution for "text data" workloads, such as finding a DNA sequence and the motif of a DNA sequence. H2Hadoop also provides an efficient Data Mining approach for Cloud Computing environments. The H2Hadoop architecture leverages the NameNode's ability to assign jobs to the TaskTrackers (DataNodes) within the cluster. By adding control features to the NameNode, H2Hadoop can intelligently direct and assign tasks to the DataNodes that contain the required data without sending the job to the whole cluster. Compared with native Hadoop, H2Hadoop reduces CPU time, the number of read operations, and other Hadoop factors.
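The routing idea in the abstract (send tasks only to the DataNodes that hold the required data, using metadata from related earlier jobs) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NameNodeSketch`, its methods, the node names, and the sample DNA string are all hypothetical, and the sketch assumes the metadata is a simple in-memory table keyed by the searched feature.

```python
class NameNodeSketch:
    """Hypothetical NameNode-side lookup table; an illustration only."""

    def __init__(self, data_nodes):
        # All DataNodes in the (assumed) cluster.
        self.data_nodes = list(data_nodes)
        # Assumed metadata: feature (e.g. a DNA sequence) -> set of
        # DataNodes on which an earlier job found that feature.
        self.job_metadata = {}

    def nodes_for(self, feature):
        """Pick the nodes a job searching for `feature` should run on."""
        known = self.job_metadata.get(feature)
        if known is not None:
            # A related job ran before: target only the recorded nodes.
            return sorted(known)
        # No metadata yet: fall back to the whole cluster.
        return list(self.data_nodes)

    def record_result(self, feature, nodes_with_hits):
        """Store which nodes produced results for `feature`."""
        self.job_metadata[feature] = set(nodes_with_hits)


cluster = NameNodeSketch(["dn1", "dn2", "dn3", "dn4"])
first_run = cluster.nodes_for("GGGCGGCGA")    # no metadata: all four nodes
cluster.record_result("GGGCGGCGA", ["dn3", "dn1"])
second_run = cluster.nodes_for("GGGCGGCGA")   # only the two recorded nodes
```

In this sketch the second job for the same sequence is dispatched to two nodes instead of four, which is the kind of saving in CPU time and read operations the abstract attributes to H2Hadoop; the real gains depend on details the abstract does not give.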
Pages: 1031-1040
Page count: 10