Investigating Apache Hama: a bulk synchronous parallel computing framework

Cited by: 8
Authors
Siddique, Kamran [1]
Akhtar, Zahid [2]
Kim, Yangwoo [1]
Jeong, Young-Sik [1]
Yoon, Edward J. [3]
Affiliations
[1] Dongguk Univ, Seoul, South Korea
[2] Univ Quebec, Montreal, PQ, Canada
[3] Samsung Elect, Seoul, South Korea
Keywords
Apache Hama; BSP; Bulk synchronous parallel; Distributed computing; MapReduce; Hadoop
DOI
10.1007/s11227-017-1987-9
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The quantity of digital data is growing exponentially, and efficiently processing such massive data is becoming increasingly challenging. Recently, academia and industry have recognized the limitations of the predominant Hadoop framework in several application domains, such as complex algorithmic computation, graph processing, and streaming data. The widely known map-shuffle-reduce paradigm has become a bottleneck in addressing the challenges of big data trends. Demand for research and development of novel massive computing frameworks is increasing rapidly, and researchers in the field need a systematic illustration and analysis of these frameworks, along with highlights of potential research areas. Therefore, we explore one of the emerging and promising distributed computing frameworks, Apache Hama. It is a top-level project under the Apache Software Foundation and a pure bulk synchronous parallel model for processing massive scientific computations, e.g., graph, matrix, and network algorithms. The objectives of this contribution are twofold. First, we outline the current state of the art, identify the challenges, and frame some research directions for researchers and application developers. Second, we present real-world use cases of Apache Hama to illustrate its potential, specifically to the industrial community.
Pages: 4190-4205
Page count: 16
Related papers
33 in total
  • [21] Scaling GMM Expectation Maximization Algorithm Using Bulk Synchronous Parallel Approach
    Ratnaparkhi, Abhay A.
    Pilli, Emmanuel
    Joshi, R. C.
    2015 INTERNATIONAL CONFERENCE ON GREEN COMPUTING AND INTERNET OF THINGS (ICGCIOT), 2015 : 558 - 562
  • [22] Bulk-Synchronous Parallel Simultaneous BVH Traversal for Collision Detection on GPUs
    Chitalu, Floyd M.
    Dubach, Christophe
    Komura, Taku
    ACM SIGGRAPH SYMPOSIUM ON INTERACTIVE 3D GRAPHICS AND GAMES (I3D 2018), 2018
  • [23] A Parallel Sequential SBAS Processing Framework Based on Hadoop Distributed Computing
    Wu, Zhenning
    Lv, Xiaolei
    Yun, Ye
    Duan, Wei
    REMOTE SENSING, 2024, 16 (03)
  • [24] A parallel computing framework for solving user equilibrium problem on computer clusters
    Chen, Xinyuan
    Liu, Zhiyuan
    Kim, Inhi
    TRANSPORTMETRICA A-TRANSPORT SCIENCE, 2020, 16 (03) : 550 - 573
  • [25] MigPF: Towards on self-organizing process rescheduling of Bulk-Synchronous Parallel applications
    Righi, Rodrigo da Rosa
    Gomes, Roberto de Quadros
    Rodrigues, Vinicius Facco
    da Costa, Cristiano Andre
    Alberti, Antonio Marcos
    Pilla, Laercio Lima
    Alexandre Navaux, Philippe Olivier
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 78 : 272 - 286
  • [26] A Framework for Self-managing Database Support and Parallel Computing for Assistive Systems
    Marten, Dennis
    Heuer, Andreas
    8TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS (PETRA 2015), 2015
  • [27] DPAC: An object-oriented distributed and parallel computing framework for manufacturing applications
    Raghavan, NRS
    Waghmare, T
    IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, 2002, 18 (04) : 431 - 443
  • [28] Simulation for bulk synchronous parallel superstep task assignment in desktop grids characterised by gaussian parameter distributions
    Wilson Garcia, Edscott
    Morales-Luna, Guillermo
    MULTIAGENT AND GRID SYSTEMS, 2008, 4 (02) : 141 - 166
  • [29] Optimizing Urban LiDAR Flight Path Planning Using a Genetic Algorithm and a Dual Parallel Computing Framework
    Vo, Anh Vu
    Laefer, Debra E.
    Byrne, Jonathan
    REMOTE SENSING, 2021, 13 (21)
  • [30] A parallel distributed computing framework for Newton-Raphson load flow analysis of large interconnected power systems
    Kumar, R. Sreerama
    Chandrasekharan, E.
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2015, 73 : 1 - 6