Hadoop Distributed File System for Big data analysis

Cited by: 2
Authors
Almansouri, Hatim Talal [1]
Masmoudi, Youssef [1]
Affiliations
[1] Saudi Elect Univ, Riyadh, Saudi Arabia
Source
PROCEEDINGS OF 2019 IEEE 4TH WORLD CONFERENCE ON COMPLEX SYSTEMS (WCCS'19) | 2019
Keywords
Hadoop; MapReduce; HDFS; DataNode; NameNode; Big Data Analysis
DOI
10.1109/icocs.2019.8930804
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Hadoop is a framework for processing data volumes too large for conventional systems to handle. Its file management layer, the Hadoop Distributed File System (HDFS), comprises a NameNode and DataNodes and divides the data into blocks based on the total size of the dataset. Hadoop also includes MapReduce, which processes the dataset in a map phase followed by a reduce phase. Big data analysis with Hadoop has revealed important information that can be used for analytical purposes and to enable new products. Big data comes from many sources, such as social networks, web server logs, broadcast audio streams, and banking transactions. In this paper, we illustrate the main steps to set up Hadoop and MapReduce. The version illustrated is Hadoop 3.1.1, the latest release at the time of writing. Simplified pseudocode is provided to show the functionality of the Map class and the Reduce class. The steps are applied to a given example that could be generalized to larger datasets.
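
The map phase and reduce phase mentioned in the abstract can be illustrated with a small word-count sketch. This is not the paper's pseudocode and it does not use the Hadoop API; it is a plain-Python simulation, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory dataset."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce functions would run on different nodes over HDFS blocks, and the framework itself would perform the grouping step; here everything runs in one process so the data flow is easy to follow.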
Pages: 257-261
Number of pages: 5