Efficient fine-grained shared buffer management for multiple OpenCL devices

Cited by: 1
Authors
Xun, Chang-qing [1, 2]
Chen, Dong [1, 2]
Lan, Qiang [1, 2]
Zhang, Chun-yuan [1, 2]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China
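Source
JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS | 2013, Vol. 14, No. 11
Funding
National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education;
Keywords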
Shared buffer; OpenCL; Heterogeneous programming; Fine grained; CPU; GPU;
DOI
10.1631/jzus.C1300078
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
OpenCL provides full code portability across hardware platforms and is a good programming candidate for heterogeneous systems, which typically consist of a host processor and several accelerators. However, to make full use of the computing capacity of such a system, programmers are required to manage diverse OpenCL-enabled devices explicitly, including distributing the workload and managing data transfer between devices. These tedious tasks pose a huge challenge for programmers. In this paper, a distributed shared OpenCL memory (DSOM) is presented, which relieves users of having to manage data transfer explicitly by supporting shared buffers across devices. DSOM allocates shared buffers in system memory and treats the on-device memory as a software-managed virtual cache buffer. To support fine-grained shared buffer management, we designed a kernel parser in DSOM for buffer access range analysis. A basic modified, shared, invalid (MSI) cache coherence protocol is implemented in DSOM to keep the cache buffers coherent. In addition, we propose a novel strategy to minimize communication cost between devices by launching each necessary data transfer as early as possible. This strategy enables data transfer to overlap with kernel execution. Our experimental results show that our method for buffer access range analysis has good applicability and that DSOM is highly efficient.
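To illustrate the modified, shared, invalid coherency idea mentioned in the abstract, the following is a minimal sketch only, not the DSOM implementation. It assumes whole-buffer granularity (the paper works at a finer grain), treats host memory as the home copy, and logs transfers instead of performing them. The names `SharedBuffer`, `State`, `read`, and `write` are hypothetical.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"  # this device holds the only up-to-date copy
    SHARED = "S"    # this device's copy matches host memory
    INVALID = "I"   # this device's copy must not be used


class SharedBuffer:
    """Hypothetical whole-buffer MSI tracker; host memory is the home copy."""

    def __init__(self, devices):
        self.state = {d: State.INVALID for d in devices}
        self.transfers = []  # log of (source, destination) copies

    def _write_back_owner(self, requester):
        # If another device holds a modified copy, flush it to host memory.
        for dev, st in self.state.items():
            if dev != requester and st is State.MODIFIED:
                self.transfers.append((dev, "host"))
                self.state[dev] = State.SHARED

    def read(self, device):
        # A kernel on `device` is about to read the buffer.
        if self.state[device] is State.INVALID:
            self._write_back_owner(device)
            self.transfers.append(("host", device))
            self.state[device] = State.SHARED

    def write(self, device):
        # A kernel on `device` is about to write the buffer.
        # Simplification: a stale copy is always refreshed first, even if
        # the kernel would overwrite the whole buffer.
        if self.state[device] is State.INVALID:
            self._write_back_owner(device)
            self.transfers.append(("host", device))
        for dev in self.state:
            if dev != device:
                self.state[dev] = State.INVALID
        self.state[device] = State.MODIFIED


buf = SharedBuffer(["cpu", "gpu"])
buf.write("gpu")   # gpu fetches from host, then owns the buffer
buf.read("cpu")    # gpu writes back, cpu fetches; both become shared
print(buf.transfers)
```

In a real runtime, each logged transfer would correspond to an enqueued copy, and issuing it as soon as the producing kernel finishes is what would let it overlap with other kernel execution.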
Pages: 859-872
Number of pages: 14
Related papers
27 in total
  • [1] Efficient fine-grained shared buffer management for multiple OpenCL devices
    Chang-qing XUN
    Dong CHEN
    Qiang LAN
    Chun-yuan ZHANG
    Frontiers of Information Technology & Electronic Engineering, 2013, (11) : 859 - 872
  • [2] Efficient fine-grained shared buffer management for multiple OpenCL devices
    Chang-qing Xun
    Dong Chen
    Qiang Lan
    Chun-yuan Zhang
    Journal of Zhejiang University SCIENCE C, 2013, 14 : 859 - 872
  • [3] Energy Efficient Fine-grained Approach for Solar Photovoltaic Management System
    Jiang, Yuncong
    Abu Qahouq, Jaber A.
    Hassan, Ahmed
    Ahmed, Mahrous E.
    Orabi, Mohamed
    2011 IEEE 33RD INTERNATIONAL TELECOMMUNICATIONS ENERGY CONFERENCE (INTELEC), 2011,
  • [4] Fine-Grained Multitask Allocation for Participatory Sensing With a Shared Budget
    Wang, Jiangtao
    Wang, Yasha
    Zhang, Daqing
    Wang, Leye
    Xiong, Haoyi
    Helal, Abdelsalam
    He, Yuanduo
    Wang, Feng
    IEEE INTERNET OF THINGS JOURNAL, 2016, 3 (06): : 1395 - 1405
  • [5] Efficient support of fine-grained futures in Java
    Zhang, Lingli
    Krintz, Chandra
    Soman, Sunil
    PROCEEDINGS OF THE 18TH IASTED INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING AND SYSTEMS, 2006, : 175 - +
  • [6] Fine-grained Configuration Management for Collaborative Ontology Development
    Yang, Tao
    Wu, Yijian
    Peng, Xin
    Zhao, Wenyun
    2011 35TH IEEE ANNUAL INTERNATIONAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), 2011, : 230 - 238
  • [7] SMGuard: A Flexible and Fine-Grained Resource Management Framework for GPUs
    Yu, Chao
    Bai, Yuebin
    Yang, Hailong
    Cheng, Kun
    Gu, Yuhao
    Luan, Zhongzhi
    Qian, Depei
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (12) : 2849 - 2862
  • [8] Efficient and Fine-Grained Sharing of Signed Healthcare Data in Smart Healthcare
    Liu, Jianghua
    Xu, Lei
    Gu, Bruce
    Cui, Lei
    Zhu, Fei
    NETWORK AND SYSTEM SECURITY, NSS 2022, 2022, 13787 : 443 - 458
  • [9] The storage strategy for fine-grained OpenFlow multiple-table in TCAM
    Wei, Feng
    Li, Xu
    Yuan, Dong-ming
    Hu, He-fei
    Ran, Jing
    WIRELESS COMMUNICATION AND SENSOR NETWORK, 2016, : 532 - 538
  • [10] An efficient fine-grained parallel particle swarm optimization method based on gpu-acceleration
    Li, Jianming
    Wan, Danling
    Ch, Zhongxian
    Hu, Xangpei
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2007, 3 (6B): : 1707 - 1714