Hadoop, MapReduce and HDFS: A Developers Perspective

Cited by: 72
Authors
Ghazi, Mohd Rehan [1]
Gangodkar, Durgaprasad [1]
Affiliations
[1] Graph Era Univ, Dehra Dun 248002, Uttar Pradesh, India
Source
INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION AND CONVERGENCE (ICCC 2015) | 2015 / Vol. 48
Keywords
Hadoop; HDFS; MapReduce; JobTracker; TaskTracker; NameNode; DataNode
DOI
10.1016/j.procs.2015.04.108
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The number of applications running on Hadoop clusters is increasing day by day, because organizations have found a simple and efficient model that works well in distributed environments. The model is built to work efficiently on thousands of machines and massive data sets using commodity hardware. Together, HDFS and MapReduce form a scalable and fault-tolerant model that hides the complexities of Big Data analytics. Since Hadoop is becoming increasingly popular, understanding its technical details becomes essential. This fact inspired us to explore Hadoop and its components in depth. Analysing, examining and processing huge amounts of unstructured data to extract the required information has been a challenge. In this paper we discuss in detail Hadoop and its components, which comprise MapReduce and the Hadoop Distributed File System (HDFS). The MapReduce engine uses a JobTracker and TaskTrackers to handle the monitoring and execution of jobs. HDFS is a distributed file system comprising a NameNode, DataNodes and a Secondary NameNode for efficient handling of distributed storage. The details provided can be used for developing large-scale distributed applications that exploit the computational power of multiple nodes for data- and compute-intensive workloads. (C) 2015 The Authors. Published by Elsevier B.V.
Pages: 45-50
Page count: 6