Remote Execution of OpenCL and SYCL Applications via rOpenCL

Cited by: 0

Authors:

Alves, Rui [1]

Rutin, Jose [2]

Affiliations:

[1] Inst Politecn Braganca, Campus Santa Apolonia, P-5300253 Braganca, Portugal

[2] Inst Politecn Braganca, Res Ctr Digitalizat & Intelligent Robot CeDRI, Lab Sustentabil & Tecnol Reg Montanha SusTEC, Campus Santa Apolonia, P-5300253 Braganca, Portugal

Source:

2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW | 2023

Keywords:

HPC; Heterogeneous Computing; API Forwarders; OpenCL; SYCL;

DOI:

10.1109/IPDPSW59300.2023.00020

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

With the increasing computational demands of modern applications, heterogeneous systems continue to play an important role in accelerating computationally intensive tasks, a trend confirmed by the most recent HPC architectures. Efficiently exploiting these systems implies the use of specific programming paradigms, such as the classic OpenCL model, or modern single-source alternatives, like SYCL. However, the original execution model of these approaches does not provide for the use of coprocessors other than those directly attached to the host system where the heterogeneous application starts. Over time, several solutions have emerged to cope with this limitation, at both the hardware and software levels, making it possible to exploit remote/distributed coprocessors. In this paper, a representative set of seminal OpenCL API Forwarders is revisited and their performance is compared with that of rOpenCL (a recently introduced platform of the same kind), using the classical matrix multiplication case study. In addition, given the importance of SYCL, which has been steadily gaining traction, this paper also exploits the potential of rOpenCL in supporting SYCL applications that use remote accelerators. To that end, another set of benchmarks is used, with both OpenCL and SYCL implementations, providing insight not only into the performance trade-offs of local versus remote (via rOpenCL) execution, but also into the current performance differential between the two programming models.

Pages: 51-60

Number of pages: 10

