CPU-GPU Tuning for Modern Scientific Applications using Node-Level Heterogeneity

Cited by: 0
Authors
Thavappiragasam, Mathialakan [1]
Kale, Vivek [2]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Sandia Natl Labs, Livermore, CA USA
Source
2023 IEEE 30TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC 2023 | 2023
Keywords
inter-device concurrency; performance tuning; CUDA; OpenMP; supercomputer; GPU; CPU; workflows; AI/ML
DOI
10.1109/HiPC58850.2023.00034
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Scientific applications must be tuned to run efficiently on supercomputers whose nodes combine a CPU (a general-purpose host processor) with GPUs (accelerator device processors). Conventional wisdom suggests tuning applications for the GPU and limiting the CPU to offloading computation to the GPU, given the CPU's comparatively minuscule computational power. However, this approach is overly conservative for modern scientific applications, which include scientific workflows with real-time data constraints and AI/ML workloads with low numerical precision requirements. This work identifies new performance opportunities for modern scientific applications via CPU-GPU tuning, a strategy that unifies and integrates the tuning of CPU and GPU performance parameters. Applying CPU-GPU tuning to a dot product representative of these applications, run on the widely used Summit supercomputer, yields up to an 8.15x speedup. These results provide groundwork for auto-tuning software for applications run on supercomputers with node-level heterogeneity.
Pages: 179-183
Page count: 5