An Overview of Hadoop MapReduce, Spark, and Scalable Graph Processing Architecture

Cited by: 2
Authors
Talan, Pooja P. [1]
Sharma, Kartik U. [1]
Nawade, Pratiksha P. [1]
Talan, Karishma P. [2]
Affiliations
[1] Prof Ram Meghe Coll Engn & Management, Badnera Amravati, India
[2] KPIT Technol Ltd, Mumbai, Maharashtra, India
Source
RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS | 2019 / Vol. 740
Keywords
Big Data; Hadoop MapReduce; Apache Spark; Graph processing;
DOI
10.1007/978-981-13-1280-9_3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In today's technology era, Big Data has become a buzzword. Various frameworks are available to process Big Data. Hadoop and Spark are both open-source frameworks for this purpose. Hadoop provides batch processing, while Spark supports both batch and stream processing, i.e., it is a hybrid processing framework. Each framework has its own advantages and drawbacks. The contribution of this paper is a comparative analysis of Hadoop MapReduce and Apache Spark. In this paper, we also propose a scalable graph processing architecture that could be used to overcome traditional limitations of the Hadoop framework.
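The batch model that the abstract attributes to Hadoop MapReduce can be illustrated with a minimal single-process sketch. It is not taken from the paper and calls no Hadoop or Spark API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. It shows the three stages of a word count: mapping records to key-value pairs, grouping by key, and reducing each group.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (done by the framework in a real cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over an in-memory batch."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data frameworks"]))
```

In an actual deployment the framework distributes the map and reduce calls across nodes and performs the grouping step itself; this sketch only mirrors the data flow on one machine.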
Pages: 35-42
Page count: 8