Lazy Allocation and Transfer Fusion Optimization for GPU-based Heterogeneous Systems

Cited by: 0
Authors
Li, Lu [1]
Kessler, Christoph [1]
Affiliations
[1] Linkoping Univ, IDA, Linkoping, Sweden
Source
2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018) | 2018
Keywords
adaptive message fusion; GPU; CUDA; lazy memory allocation; memory transfer optimization; PARALLEL;
DOI
10.1109/PDP2018.2018.00054
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We present two memory optimization techniques that improve the efficiency of data transfer over the PCIe bus in GPU-based heterogeneous systems: lazy allocation and transfer fusion optimization. Both are based on merging data transfers so that less overhead is incurred, thereby increasing transfer throughput and making accelerator usage profitable even for smaller operand sizes. We provide the design and prototype implementation of the two techniques in CUDA. Microbenchmarking results show that significant speedups can be achieved, especially for small and medium-sized operands. We also prove that our transfer fusion optimization algorithm is optimal.
Pages: 311-315
Page count: 5
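The abstract's claim that merging transfers reduces overhead, mainly for small operands, can be illustrated with a simple cost model. The sketch below is not the authors' algorithm or implementation. It assumes a hypothetical linear model in which each transfer pays a fixed latency plus a size-proportional term, and in which fusing requires first packing the operands into one staging buffer on the host. All names (TransferModel, fusion_pays_off, and so on) and all parameter values are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical linear cost model for one host-to-device copy:
// a fixed per-transfer overhead plus a size-proportional term.
struct TransferModel {
    double latency_us;    // assumed fixed overhead per transfer, in microseconds
    double bytes_per_us;  // assumed sustained bus bandwidth, in bytes per microsecond
};

// Modeled cost of a single transfer of `bytes` bytes.
double transfer_cost_us(const TransferModel& m, std::size_t bytes) {
    assert(m.bytes_per_us > 0.0);
    return m.latency_us + static_cast<double>(bytes) / m.bytes_per_us;
}

// Modeled cost of sending every operand in its own transfer.
double separate_cost_us(const TransferModel& m,
                        const std::vector<std::size_t>& sizes) {
    double total = 0.0;
    for (std::size_t s : sizes) total += transfer_cost_us(m, s);
    return total;
}

// Modeled cost of one fused transfer: pack all operands into a staging
// buffer on the host (at an assumed memcpy bandwidth), then send once.
double fused_cost_us(const TransferModel& m,
                     const std::vector<std::size_t>& sizes,
                     double pack_bytes_per_us) {
    assert(pack_bytes_per_us > 0.0);
    if (sizes.empty()) return 0.0;
    std::size_t total_bytes = 0;
    for (std::size_t s : sizes) total_bytes += s;
    const double pack_us = static_cast<double>(total_bytes) / pack_bytes_per_us;
    return pack_us + transfer_cost_us(m, total_bytes);
}

// Under this model, fusion is profitable only when the saved per-transfer
// latencies outweigh the extra packing cost.
bool fusion_pays_off(const TransferModel& m,
                     const std::vector<std::size_t>& sizes,
                     double pack_bytes_per_us) {
    return fused_cost_us(m, sizes, pack_bytes_per_us) < separate_cost_us(m, sizes);
}
```

With an assumed latency of 10 microseconds, a bus bandwidth of 10000 bytes per microsecond, and a packing bandwidth of 5000 bytes per microsecond, four 1000-byte operands are cheaper to send fused, while two 1000000-byte operands are cheaper to send separately. This matches the abstract's observation that the gains concentrate on small and medium-sized operands.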