An Overview of Hadoop MapReduce, Spark, and Scalable Graph Processing Architecture

Cited by: 2
Authors
Talan, Pooja P. [1 ]
Sharma, Kartik U. [1 ]
Nawade, Pratiksha P. [1 ]
Talan, Karishma P. [2 ]
Affiliations
[1] Prof Ram Meghe Coll Engn & Management, Badnera Amravati, India
[2] KPIT Technol Ltd, Mumbai, Maharashtra, India
Source
RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS | 2019 / Vol. 740
Keywords
Big Data; Hadoop MapReduce; Apache Spark; Graph processing;
DOI
10.1007/978-981-13-1280-9_3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In today's technology era, Big Data has become a buzzword. Various frameworks are available to process Big Data. Hadoop and Spark are both open-source frameworks for processing Big Data. Hadoop provides batch processing, while Spark supports both batch and stream processing, i.e., it is a hybrid processing framework. Both frameworks have their own advantages and drawbacks. The contribution of this paper is a comparative analysis of Hadoop MapReduce and Apache Spark. In this paper, we also propose a scalable graph processing architecture that could be used to overcome traditional limitations of the Hadoop framework.
Pages: 35-42
Page count: 8
Related papers
50 in total
  • [21] Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data
    Hussein, Eslam
    Sadiki, Ronewa
    Jafta, Yahlieel
    Sungay, Muhammad Mujahid
    Ajayi, Olasupo
    Bagula, Antoine
    E-INFRASTRUCTURE AND E-SERVICES FOR DEVELOPING COUNTRIES (AFRICOMM 2019), 2020, 311 : 180 - 185
  • [22] Towards Online Graph Processing with Spark Streaming
    Abughofa, Tariq
    Zulkernine, Farhana
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 2787 - 2794
  • [23] Scalable Concurrency Debugging with Distributed Graph Processing
    Zheng, Long
    Liao, Xiaofei
    Jin, Hai
    Zhao, Jieshan
    Wang, Qinggang
    PROCEEDINGS OF THE 2018 INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO'18), 2018, : 188 - 199
  • [24] In-Memory Parallel Processing of Massive Remotely Sensed Data Using an Apache Spark on Hadoop YARN Model
    Huang, Wei
    Meng, Lingkui
    Zhang, Dongying
    Zhang, Wen
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2017, 10 (01) : 3 - 19
  • [25] Efficient and Scalable Graph Parallel Processing With Symbolic Execution
    Zheng, Long
    Liao, Xiaofei
    Jin, Hai
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2018, 15 (01)
  • [26] ScalaGraph: A Scalable Accelerator for Massively Parallel Graph Processing
    Yao, Pengcheng
    Zheng, Long
    Huang, Yu
    Wang, Qinggang
    Gui, Chuangyi
    Zeng, Zhen
    Liao, Xiaofei
    Jin, Hai
    Xue, Jingling
    2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022), 2022, : 199 - 212
  • [27] GraphMap: scalable iterative graph processing using NoSQL
    Goswami, Sayan
    Pokhrel, Ayam
    Lee, Kisung
    Liu, Ling
    Zhang, Qi
    Zhou, Yang
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (09) : 6619 - 6647
  • [28] GraphMap: scalable iterative graph processing using NoSQL
    Sayan Goswami
    Ayam Pokhrel
    Kisung Lee
    Ling Liu
    Qi Zhang
    Yang Zhou
    The Journal of Supercomputing, 2020, 76 : 6619 - 6647
  • [29] Spark-based adaptive Mapreduce data processing method for remote sensing imagery
    Tan, Xicheng
    Di, Liping
    Zhong, Yanfei
    Yao, Yayu
    Sun, Ziheng
    Ali, Yahya
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2021, 42 (01) : 171 - 187
  • [30] An Overview of Medusa: Simplified Graph Processing on GPUs
    Zhong, Jianlong
    He, Bingsheng
    ACM SIGPLAN NOTICES, 2012, 47 (08) : 283 - 284