Performance Portability Study of Epistasis Detection using SYCL on NVIDIA GPU

Cited by: 4

Authors:

Jin, Zheming [1]

Vetter, Jeffrey S. [1]

Affiliations:

[1] Oak Ridge Natl Lab, POB 2008, Oak Ridge, TN 37830 USA

Source:

13TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, BCB 2022 | 2022

Keywords:

portability; programming model; GPU; epistasis

DOI:

10.1145/3535508.3545591

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We describe the experience of converting a CUDA implementation of a high-order epistasis detection algorithm to SYCL. Our goal is to give application and compiler developers a detailed description of migration paths between CUDA and SYCL. Evaluating the CUDA and SYCL applications on an NVIDIA V100 GPU, we find that loop unrolling must be applied manually to the SYCL kernel to obtain comparable performance. The performance of the SYCL group reduce function, an alternative to the CUDA warp-based reduction, depends on the problem size and the work-group size. The 64-bit popcount operation implemented with a tree of adders is slightly faster than the built-in popcount operation. When the number of OpenMP threads is four, the highest performance of the SYCL application is comparable to that of the CUDA application.
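The abstract compares a 64-bit popcount built from a tree of adders with the built-in popcount (`sycl::popcount` in SYCL, `__popcll` in CUDA). This record does not include the paper's code, so the following is only a minimal host-side C++ sketch of the general tree-of-adders technique, not the authors' kernel. The function names `popcount64_tree` and `popcount64_naive` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: counts set bits in a 64-bit word with a tree of adders.
// Each step adds neighbouring partial counts in parallel inside the word,
// doubling the field width (1 -> 2 -> 4 -> 8 -> 16 -> 32 -> 64 bits).
inline int popcount64_tree(std::uint64_t x) {
    x = (x & 0x5555555555555555ULL) + ((x >> 1)  & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2)  & 0x3333333333333333ULL);
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4)  & 0x0F0F0F0F0F0F0F0FULL);
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = (x & 0x00000000FFFFFFFFULL) + (x >> 32);
    return static_cast<int>(x);
}

// Hypothetical reference: examines one bit per iteration, used only to
// check the tree version.
inline int popcount64_naive(std::uint64_t x) {
    int count = 0;
    while (x != 0) {
        count += static_cast<int>(x & 1ULL);
        x >>= 1;
    }
    return count;
}
```

The same sequence of masks and shifts can be placed inside a device function; whether it outperforms the built-in instruction depends on the compiler and hardware, as the abstract's V100 measurement suggests.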

Pages: 8
