Big Data Analytics in Java with PCJ Library: Performance Comparison with Hadoop

Cited by: 4
Authors
Nowicki, Marek [1]
Ryczkowska, Magdalena [1, 2]
Gorski, Lukasz [1, 2]
Bala, Piotr [2]
Affiliations
[1] Nicolaus Copernicus Univ, Fac Math & Comp Sci, Chopina 12-18, PL-87100 Torun, Poland
[2] Univ Warsaw, Interdisciplinary Ctr Math & Computat Modeling, Pawinskiego 5a, PL-02106 Warsaw, Poland
Source
PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2017), PT II | 2018 / Vol. 10778
Keywords
Big Data; Java; Parallel computing; Hadoop
DOI
10.1007/978-3-319-78054-2_30
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This article presents Big Data analytics using Java and the PCJ library. PCJ is an award-winning library for developing parallel codes using the PGAS programming paradigm. PCJ can be used for easy implementation of different algorithms, including those used for Big Data processing. In this paper, we present performance results for standard benchmarks covering different types of applications, from computationally intensive, through traditional MapReduce, up to communication intensive. The performance is compared to that achieved on the same hardware using Hadoop. The PCJ implementation has been used with both the local file system and HDFS. Code written with PCJ can be developed much faster, as it requires fewer libraries. Our results show that applications developed with the PCJ library are much faster compared to the Hadoop implementation.
Pages: 318-327
Page count: 10
Related papers
50 in total
  • [1] PCJ - Java Library for Highly Scalable HPC and Big Data Processing
    Nowicki, Marek
    Gorski, Lukasz
    Bala, Piotr
    PROCEEDINGS 2018 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2018, : 12 - 20
  • [2] Comparison of the HPC and Big Data Java Libraries Spark, PCJ and APGAS
    Posner, Jonas
    Reitz, Lukas
    Fohry, Claudia
    PROCEEDINGS OF PAW-ATM18: 2018 IEEE/ACM PARALLEL APPLICATIONS WORKSHOP, ALTERNATIVES TO MPI (PAW-ATM), 2018, : 11 - 22
  • [3] PCJ Java library as a solution to integrate HPC, Big Data and Artificial Intelligence workloads
    Nowicki, Marek
    Gorski, Lukasz
    Bala, Piotr
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [4] PCJ - Java library for high performance computing in PGAS model
    Nowicki, Marek
    Gorski, Lukasz
    Grabarczyk, Patryk
    Bala, Piotr
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2014, : 202 - 209
  • [5] PCJ Java library as a solution to integrate HPC, Big Data and Artificial Intelligence workloads
    Marek Nowicki
    Łukasz Górski
    Piotr Bała
    Journal of Big Data, 8
  • [6] Parallel Differential Evolution in the PGAS Programming Model Implemented with PCJ Java Library
    Gorski, Lukasz
    Rakowski, Franciszek
    Bala, Piotr
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, PPAM 2015, PT I, 2016, 9573 : 448 - 458
  • [7] Performance Evaluation of Java/PCJ Implementation of Parallel Algorithms on the Cloud
    Nowicki, Marek
    Gorski, Lukasz
    Bala, Piotr
    EURO-PAR 2020: PARALLEL PROCESSING WORKSHOPS, 2021, 12480 : 213 - 224
  • [8] Level-synchronous BFS algorithm implemented in Java using PCJ Library
    Ryczkowska, Magdalena
    Nowicki, Marek
    Bala, Piotr
    2016 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE & COMPUTATIONAL INTELLIGENCE (CSCI), 2016, : 596 - 601
  • [9] Fault-Tolerance Mechanisms for the Java Parallel Codes Implemented with the PCJ Library
    Szynkiewicz, Michal
    Nowicki, Marek
    PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2017), PT II, 2018, 10778 : 298 - 307
  • [10] Evaluation of the Parallel Performance of the Java and PCJ on the Intel KNL Based Systems
    Nowicki, Marek
    Gorski, Lukasz
    Bala, Piotr
    PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2017), PT II, 2018, 10778 : 288 - 297