Remote Execution of OpenCL and SYCL Applications via rOpenCL

Times cited: 0
Authors
Alves, Rui [1 ]
Rufino, Jose [2 ]
Affiliations
[1] Inst Politecn Braganca, Campus Santa Apolonia, P-5300253 Braganca, Portugal
[2] Inst Politecn Braganca, Res Ctr Digitalizat & Intelligent Robot CeDRI, Lab Sustentabil & Tecnol Reg Montanha SusTEC, Campus Santa Apolonia, P-5300253 Braganca, Portugal
Source
2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW | 2023
Keywords
HPC; Heterogeneous Computing; API Forwarders; OpenCL; SYCL
DOI
10.1109/IPDPSW59300.2023.00020
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
With the increasing computational demands of modern applications, heterogeneous systems continue to play an important role in accelerating computationally intensive tasks, a trend confirmed by the most recent HPC architectures. Exploiting these systems efficiently requires specific programming paradigms, such as the classic OpenCL model, or modern single-source alternatives, like SYCL. However, the original execution model of these approaches makes no provision for co-processors other than those directly attached to the host system where the heterogeneous application starts. Over time, several solutions have emerged to cope with this limitation, at both the hardware and software levels, making it possible to exploit remote/distributed co-processors. In this paper, a representative set of seminal OpenCL API Forwarders is revisited and their performance is compared with that of rOpenCL (a recently introduced platform of the same kind), using the classical matrix multiplication case study. In addition, given the importance of SYCL, which has been steadily gaining traction, this paper also exploits the potential of rOpenCL in supporting SYCL applications that use remote accelerators. To that end, another set of benchmarks is used, with both OpenCL and SYCL implementations, providing insight not only into the performance trade-offs of local versus remote (via rOpenCL) execution, but also into the current performance differential between the two programming models.
Pages: 51-60
Number of pages: 10