HPSM: a programming framework to exploit multi-CPU and multi-GPU systems simultaneously

Cited by: 0
Authors
Ferreira Lima, Joao Vicente [1]
Di Domenico, Daniel [1]
Affiliations
[1] Univ Fed Santa Maria, Santa Maria, RS, Brazil
Keywords
high performance computing; CPU-GPU systems; parallel programming models; high-level framework; parallel loops
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper presents HPSM, a high-level C++ framework to explore multi-CPU and multi-GPU systems. HPSM enables the execution of parallel loops and reductions simultaneously over CPUs and GPUs using three parallel backends: Serial, OpenMP, and StarPU. We analysed the HPSM development effort with an AXPY program through two standard metrics (NCLOC and ES). In addition, we evaluated performance and energy with three parallel benchmarks: N-Body, Hotspot, and a CFD solver. HPSM reduced code effort by up to 56.9% compared to the StarPU C interface, although it resulted in 2.5x more lines of code compared to OpenMP. The CPU-GPU combination attained speedups with Hotspot of up to 92.7x on an x86-based system with four GPUs and up to 108.2x on an IBM POWER8+ system with two GPUs. On both systems, the addition of GPUs improved energy efficiency.
Pages: 201-211
Page count: 11
Related papers
32 in total
  • [11] Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems
    Ocetkiewicz, Krzysztof M.
    Czaplewski, Cezary
    Krawczyk, Henryk
    Lipska, Agnieszka G.
    Liwo, Adam
    Proficz, Jerzy
    Sieradzan, Adam K.
    Czarnul, Pawel
    COMPUTER PHYSICS COMMUNICATIONS, 2024, 298
  • [12] Optimization in the parallelism extraction algorithm with spanning tree on a multi-GPU environment
    Wang, Guyue
    Wada, Koichi
    Yamagiwa, Shinichi
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2019, 14 (06) : 862 - 869
  • [13] Monte Carlo Optimisation Auto-Tuning on a Multi-GPU Cluster
    Paukste, Andrius
    2012 2ND IEEE INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2012, : 894 - 898
  • [14] Techniques for Mapping Synthetic Aperture Radar Processing Algorithms to Multi-GPU Clusters
    Hayden, Eric
    Schmalz, Mark
    Chapman, William
    Ranka, Sanjay
    Sahni, Sartaj
    Seetharaman, Gunasekaran
    2012 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2012, : 13 - 18
  • [15] Granular layEr Simulator: Design and Multi-GPU Simulation of the Cerebellar Granular Layer
    Florimbi, Giordana
    Torti, Emanuele
    Masoli, Stefano
    D'Angelo, Egidio
    Leporati, Francesco
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2021, 15
  • [16] Effective extraction of accurate reduced order models for HF-ICs using multi-CPU architectures
    Lazar, Ioan-Alexandru
    Ciuprina, Gabriela
    Ioan, Daniel
    INVERSE PROBLEMS IN SCIENCE AND ENGINEERING, 2012, 20 (01) : 15 - 27
  • [17] Multi-GPU 3-D Reverse Time Migration with Minimum I/O
    Barbosa, Carlos H. S.
    Coutinho, Alvaro L. G. A.
    HIGH PERFORMANCE COMPUTING, CARLA 2022, 2022, 1660 : 160 - 173
  • [18] Porting the MPI-parallelised LES model PALM to multi-GPU systems and many integrated core processors - an experience report
    Knoop, Helge
    Gronemeier, Tobias
    Suehring, Matthias
    Steinbach, Peter
    Noack, Matthias
    Wende, Florian
    Steinke, Thomas
    Knigge, Christoph
    Raasch, Siegfried
    Ketelsen, Klaus
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2018, 17 (03) : 297 - 309
  • [19] Improving the Performance of Cardiac Simulations in a Multi-GPU Architecture Using a Coalesced Data and Kernel Scheme
    Cordeiro, Raphael Pereira
    Oliveira, Rafael Sachetto
    dos Santos, Rodrigo Weber
    Lobosco, Marcelo
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2016, 2016, 10048 : 546 - 553
  • [20] Towards the Simulation of a Realistic Large-Scale Spiking Network on a Desktop Multi-GPU System
    Torti, Emanuele
    Florimbi, Giordana
    Dorici, Arianna
    Danese, Giovanni
    Leporati, Francesco
    BIOENGINEERING-BASEL, 2022, 9 (10)