Performance Portability Study of Epistasis Detection using SYCL on NVIDIA GPU

Cited by: 4
Authors
Jin, Zheming [1]
Vetter, Jeffrey S. [1]
Affiliations
[1] Oak Ridge Natl Lab, POB 2008, Oak Ridge, TN 37830 USA
Source
13TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, BCB 2022 | 2022
Keywords
portability; programming model; GPU; epistasis
DOI
10.1145/3535508.3545591
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe the experience of converting a CUDA implementation of a high-order epistasis detection algorithm to SYCL. Our goal is for this work, with its detailed description of migration paths between CUDA and SYCL, to be useful to application and compiler developers. Evaluating the CUDA and SYCL applications on an NVIDIA V100 GPU, we find that loop unrolling must be applied manually to the SYCL kernel to obtain comparable performance. The performance of the SYCL group reduce function, an alternative to the CUDA warp-based reduction, depends on the problem and work-group sizes. The 64-bit popcount operation implemented with a tree of adders is slightly faster than the built-in popcount operation. When the number of OpenMP threads is four, the highest performance of the SYCL and CUDA applications is comparable.
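For readers unfamiliar with the "tree of adders" popcount mentioned in the abstract, the following is a minimal sketch in plain C++ of one common way to write it. It is not taken from the paper: the function name `popcount64_tree` is hypothetical, and the authors' CUDA and SYCL kernels may differ in detail. Each step adds neighbouring bit fields in parallel, doubling the field width until a single 64-bit field holds the total.

```cpp
#include <cstdint>

// Hypothetical helper: counts the set bits of a 64-bit word with a tree of
// adders. Illustrative only; the paper's implementation is not shown here.
inline int popcount64_tree(std::uint64_t x) {
    // Level 1: add adjacent bits, giving 32 two-bit partial sums.
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    // Level 2: add adjacent two-bit sums, giving 16 four-bit sums.
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    // Level 3: add adjacent four-bit sums, giving 8 byte-wide sums.
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    // Level 4: add adjacent bytes, giving 4 sixteen-bit sums.
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8) & 0x00FF00FF00FF00FFULL);
    // Level 5: add adjacent sixteen-bit sums, giving 2 thirty-two-bit sums.
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    // Level 6: add the two halves to obtain the final count (0..64).
    x = (x & 0x00000000FFFFFFFFULL) + (x >> 32);
    return static_cast<int>(x);
}
```

The function uses only shifts, masks, and additions, so the same body could in principle be placed in a device function. Whether it outperforms a built-in popcount depends on the compiler and hardware, as the abstract's measurement on the V100 suggests.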
Pages: 8
Related papers
50 in total
  • [41] Performance of dynamic texture segmentation using GPU
    Gomez Fernandez, Francisco
    Elena Buemi, Maria
    Manuel Rodriguez, Juan
    Jacobo-Berlles, Julio C.
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2016, 11 (02) : 375 - 383
  • [42] Improving performance of SYCL applications on CPU architectures using LLVM-directed compilation flow
    Ghiglio, Pietro
    Dolinsky, Uwe
    Goli, Mehdi
    Narasimhan, Kumudha
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (27)
  • [43] Improving performance of SYCL applications on CPU architectures using LLVM-directed compilation flow
    Ghiglio, Pietro
    Dolinsky, Uwe
    Goli, Mehdi
    Narasimhan, Kumudha
    PROCEEDINGS OF THE THIRTEENTH INTERNATIONAL WORKSHOP ON PROGRAMMING MODELS AND APPLICATIONS FOR MULTICORES AND MANYCORES (PMAM '22), 2022, : 1 - 10
  • [44] Epistasis detection using a permutation-based gradient boosting machine
    Che, Kai
    Liu, Xiaoyan
    Guo, Maozu
    Zhang, Junwei
    Wang, Lei
    Zhang, Yin
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 1247 - 1252
  • [45] Accelerating the performance of Sequence Alignment using High Performance Multicore GPU
    Kaur, Karamjeet
    Chakraborty, Sudeshna
    Singh, Sanika
    Gupta, Manoj Kumar
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND KNOWLEDGE ECONOMY (ICCIKE' 2019), 2019, : 478 - 481
  • [46] Comparative Study on Face Detection by GPU, CPU and OpenCV
    Patidar, Sanjay
    Singh, Upendra
    Patidar, Ashish
    Munsoori, Riyaz Ali
    Patidar, Jyoti
    SECOND INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS AND COMMUNICATION TECHNOLOGIES, ICCNCT 2019, 2020, 44 : 686 - 696
  • [47] An Approach of Epistasis Detection Using Integer Linear Programming Optimizing Bayesian Network
    Yang, Xuan
    Yang, Chen
    Lei, Jimeng
    Liu, Jianxiao
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (05) : 2654 - 2671
  • [48] Using the integrated GPU to improve CPU sort performance
    Lupescu, Grigore
    Slusanschi, Emil-Ioan
    Tapus, Nicolae
    2017 46TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPPW), 2017, : 39 - 44
  • [49] Performance Optimisation of Parallelized ADAS Applications in FPGA-GPU Heterogeneous Systems: A Case Study With Lane Detection
    Wang, Xiebing
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2019, 4 (04) : 519 - 531
  • [50] Acceleration of Anomaly Detection in Blockchain Using In-GPU Cache
    Morishima, Shin
    Matsutani, Hiroki
    2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 244 - 251