An Overview of Hadoop MapReduce, Spark, and Scalable Graph Processing Architecture

Cited by: 2

Authors:

Talan, Pooja P. [1]

Sharma, Kartik U. [1]

Nawade, Pratiksha P. [1]

Talan, Karishma P. [2]

Affiliations:

[1] Prof Ram Meghe Coll Engn & Management, Badnera Amravati, India

[2] KPIT Technol Ltd, Mumbai, Maharashtra, India

Source:

RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS | 2019 / Vol. 740

Keywords:

Big Data; Hadoop MapReduce; Apache Spark; Graph processing;

DOI:

10.1007/978-981-13-1280-9_3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Big Data has become a buzzword, and various frameworks are available to process it. Hadoop and Spark are both open-source frameworks for processing Big Data. Hadoop provides batch processing, while Spark supports both batch and stream processing, i.e., it is a hybrid processing framework. Both frameworks have their own advantages and drawbacks. This paper contributes a comparative analysis of Hadoop MapReduce and Apache Spark. We also propose a scalable graph processing architecture that could be used to overcome traditional limitations of the Hadoop framework.


Pages: 35-42

Page count: 8
