An Overview of Hadoop MapReduce, Spark, and Scalable Graph Processing Architecture

Cited by: 2
Authors
Talan, Pooja P. [1]
Sharma, Kartik U. [1]
Nawade, Pratiksha P. [1]
Talan, Karishma P. [2]
Affiliations
[1] Prof Ram Meghe Coll Engn & Management, Badnera Amravati, India
[2] KPIT Technol Ltd, Mumbai, Maharashtra, India
Source
RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS | 2019 / Vol. 740
Keywords
Big Data; Hadoop MapReduce; Apache Spark; Graph processing;
DOI
10.1007/978-981-13-1280-9_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In today's technology era, Big Data has become a buzzword, and various frameworks are available to process it. Hadoop and Spark are both open-source frameworks for processing Big Data. Hadoop provides batch processing, while Spark supports both batch and stream processing, i.e., it is a hybrid processing framework. Each framework has its own advantages and drawbacks. The contribution of this paper is a comparative analysis of Hadoop MapReduce and Apache Spark. We also propose a scalable graph processing architecture that could be used to overcome traditional limitations of the Hadoop framework.
Pages: 35-42
Number of pages: 8