DHTS: A Dynamic Hybrid Tiling Strategy for Optimizing Stencil Computation on GPUs

Authors
Liu, Song [1]
Zhang, Zengyuan [1]
Wu, Weiguo [1]
Affiliations
[1] Xi'an Jiaotong University, School of Computer Science and Technology, Xi'an 710049, Shaanxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Stencil computation; dynamic hybrid tiling; performance
DOI
10.1109/TC.2023.3271060
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Stencil computation is an important class of computational patterns in scientific computing applications. Loop tiling techniques have been widely studied to accelerate stencil computations on different architectures by exploiting parallelism and data locality. Recent advanced tiling methods enable tile-wise concurrent start-up to improve execution performance. However, such methods statically partition all dimensions of the iteration space into tiles with predetermined complex shapes and sizes, which leads to low thread utilization and poor memory access efficiency on GPUs. In this paper, we present DHTS, a novel dynamic hybrid tiling strategy for stencil computations. DHTS employs static tiling on the outer dimensions to achieve concurrent start-up parallelism, and applies a dynamic rectangular tiling method on the inner dimensions to improve thread utilization and memory access efficiency. By deriving tile size constraints, DHTS adaptively equalizes the workload across tiles, thereby reducing idle threads and increasing coalesced memory accesses within tiles. We implement the proposed strategy with different complex tile shapes. Experimental results on Titan V and Tesla V100 GPUs show that DHTS effectively improves the execution performance of 2D/3D stencils compared to state-of-the-art tiling methods, with a best-case improvement of 28x.
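For readers unfamiliar with the terms, the following is a minimal sketch of a stencil sweep and of plain rectangular tiling over its iteration space. It is not the DHTS algorithm, whose tile shapes and size constraints the abstract does not spell out; the function names, the 5-point Jacobi stencil, the 0.2 weight, and the tile sizes are all assumptions chosen for illustration.

```python
# Illustrative only: a 2D 5-point Jacobi stencil, untiled and with
# rectangular tiling. Names and tile sizes are hypothetical, not from DHTS.

def update(grid, i, j):
    """Average a cell with its four neighbours (5-point stencil)."""
    return 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                  + grid[i][j - 1] + grid[i][j + 1])

def jacobi_step(grid):
    """One untiled sweep; boundary cells are copied unchanged."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = update(grid, i, j)
    return out

def jacobi_step_tiled(grid, tile_i, tile_j):
    """The same sweep, visiting the interior one rectangular tile at a time.

    On a GPU each tile would typically map to a thread block; here the
    tiles are simply processed in sequence.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ti in range(1, n - 1, tile_i):          # tile origin, rows
        for tj in range(1, m - 1, tile_j):      # tile origin, columns
            for i in range(ti, min(ti + tile_i, n - 1)):
                for j in range(tj, min(tj + tile_j, m - 1)):
                    out[i][j] = update(grid, i, j)
    return out

if __name__ == "__main__":
    grid = [[float((3 * i + 7 * j) % 11) for j in range(10)] for i in range(8)]
    same = jacobi_step(grid) == jacobi_step_tiled(grid, 3, 4)
    print("tiled sweep matches untiled sweep:", same)
```

Because a single sweep reads only the input grid, the tiles are independent and the tiled result matches the untiled one regardless of tile size. Tiling across time steps, which is where the concurrent start-up issue mentioned in the abstract arises, needs the more complex tile shapes the paper refers to and is not shown here.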
Pages: 2795-2807
Page count: 13