An OpenCL Framework for Heterogeneous Multicores with Local Memory

Cited by: 0
Authors
Lee, Jaejin [1]
Kim, Jungwon [1]
Seo, Sangmin [1]
Kim, Seungkyun [1]
Park, Jungho [1]
Kim, Honggyu [1]
Thanh Tuan Dao [1]
Cho, Yongjin [1]
Seo, Sung Jong
Lee, Seung Hak
Cho, Seung Mo
Song, Hyo Jung
Suh, Sang-Bum
Choi, Jong-Deok
Affiliations
[1] Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151744, South Korea
Source
PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES | 2010
Keywords
OpenCL; Compilers; Runtime; Software-managed caches; Memory consistency; Work-item coalescing; Preload-poststore buffering;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
In this paper, we present the design and implementation of an Open Computing Language (OpenCL) framework that targets heterogeneous accelerator multicore architectures with local memory. The architecture consists of a general-purpose processor core and multiple accelerator cores that typically do not have any cache. Instead, each accelerator core has a small internal local memory. To overcome the limited size of the local memory, our OpenCL runtime is based on software-managed caches and coherence protocols that guarantee OpenCL memory consistency. To boost performance, the runtime relies on three source-code transformation techniques performed by our OpenCL C source-to-source translator: work-item coalescing, web-based variable expansion, and preload-poststore buffering. Work-item coalescing serializes multiple SPMD-like tasks that execute concurrently in the presence of barriers and runs them sequentially on a single accelerator core. It requires the web-based variable expansion technique to allocate local memory for private variables. Preload-poststore buffering is a buffering technique that eliminates the overhead of software cache accesses. Together with work-item coalescing, it has a synergistic effect on performance. We demonstrate the effectiveness of our OpenCL framework by evaluating its performance on a system that consists of two Cell BE processors. The experimental results show that our approach is promising.
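The transformations named in the abstract can be illustrated with a minimal C sketch. It is not the authors' translator output. The kernel, the function names `kernel_coalesced` and `scale_buffered`, and the constants `MAX_WORK_ITEMS` and `BUF_ELEMS` are hypothetical, and `memcpy` stands in for whatever bulk transfer (for example DMA) a real accelerator would use. The first function shows one plausible form of work-item coalescing with variable expansion: the kernel body is split at the barrier into two loops over work-items, and the private variable `x`, which is live across the barrier, becomes an array with one slot per work-item. The second function shows the general idea behind preload-poststore buffering: data is copied into a local buffer in bulk before the loop and written back in bulk afterwards, so the loop body needs no per-access software cache lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORK_ITEMS 64u /* hypothetical upper bound on work-group size */
#define BUF_ELEMS 16u      /* hypothetical local buffer capacity, in ints */

/*
 * Hypothetical OpenCL C kernel being modeled:
 *
 *   __kernel void k(__global int *out, __local int *tmp) {
 *       int id = get_local_id(0);
 *       int x = id * 2;                 // private, live across the barrier
 *       tmp[id] = x;
 *       barrier(CLK_LOCAL_MEM_FENCE);
 *       out[id] = x + tmp[(id + 1) % get_local_size(0)];
 *   }
 */

/* Coalesced form: one core runs the whole work-group sequentially. */
void kernel_coalesced(int *out, int *tmp, size_t local_size)
{
    int x[MAX_WORK_ITEMS]; /* expanded private variable, one slot per work-item */

    assert(local_size > 0 && local_size <= MAX_WORK_ITEMS);

    /* Region before the barrier, executed for every work-item. */
    for (size_t id = 0; id < local_size; id++) {
        x[id] = (int)id * 2;
        tmp[id] = x[id];
    }

    /* The barrier sat here; finishing the first loop satisfies it. */

    /* Region after the barrier, executed for every work-item. */
    for (size_t id = 0; id < local_size; id++) {
        out[id] = x[id] + tmp[(id + 1) % local_size];
    }
}

/* Buffered form: bulk preload, compute on the local copy, bulk poststore. */
void scale_buffered(int *global_data, size_t n, int factor)
{
    int buf[BUF_ELEMS]; /* stands in for a buffer in accelerator local memory */

    for (size_t base = 0; base < n; base += BUF_ELEMS) {
        size_t chunk = (n - base < BUF_ELEMS) ? (n - base) : BUF_ELEMS;

        memcpy(buf, global_data + base, chunk * sizeof(int)); /* preload */
        for (size_t i = 0; i < chunk; i++) {
            buf[i] *= factor; /* direct local access, no cache lookup */
        }
        memcpy(global_data + base, buf, chunk * sizeof(int)); /* poststore */
    }
}
```

Under these assumptions, calling `kernel_coalesced(out, tmp, 4)` fills `out` with the same values that four concurrent work-items would produce, because every write to `tmp` completes before any read that follows the original barrier.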
Pages: 193-204
Number of pages: 12
Related papers
50 items in total
  • [41] Melia: A MapReduce Framework on OpenCL-Based FPGAs
    Wang, Zeke
    Zhang, Shuhao
    He, Bingsheng
    Zhang, Wei
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (12) : 3547 - 3560
  • [42] OpenCL as a Unified Programming Model for Heterogeneous CPU/GPU Clusters
    Kim, Jungwon
    Seo, Sangmin
    Lee, Jun
    Nah, Jeongho
    Jo, Gangwon
    Lee, Jaejin
    ACM SIGPLAN NOTICES, 2012, 47 (08) : 299 - 300
  • [43] A high performance parallel DCT with OpenCL on heterogeneous computing environment
    Kim, Cheong Ghil
    Choi, Yong Soo
    MULTIMEDIA TOOLS AND APPLICATIONS, 2013, 64 (02) : 475 - 489
  • [44] A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL
    Grewe, Dominik
    O'Boyle, Michael F. P.
    COMPILER CONSTRUCTION, 2011, 6601 : 286 - 305
  • [46] Hyperion: A Generic and Distributed Mobile Offloading Framework on OpenCL
    Fu, Ziyan
    Ren, Ju
    Liu, Yunxin
    Cao, Ting
    Zhang, Deyu
    Zhou, Yuezhi
    Zhang, Yaoxue
    PROCEEDINGS OF THE TWENTIETH ACM CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, SENSYS 2022, 2022, : 607 - 621
  • [47] Adaptive OpenCL Computation Offloading Framework on Mobile Device
    Valery, Olivier
    Hung, Wei-Shu
    Chou, Ju-Cheng
    Liu, Pangfeng
    Wu, Jan-Jan
    INTELLIGENT SYSTEMS AND APPLICATIONS (ICS 2014), 2015, 274 : 1335 - 1344
  • [48] Towards a Model Transformation Tool on the Top of the OpenCL Framework
    Fekete, Tamas
    Mezei, Gergely
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON MODEL-DRIVEN ENGINEERING AND SOFTWARE DEVELOPMENT (MODELSWARD 2016), 2016, : 355 - 360
  • [49] Function portability of molecular dynamics on heterogeneous parallel architectures with OpenCL
    Halver, Rene
    Homberg, Wilhelm
    Sutmann, Godehard
    JOURNAL OF SUPERCOMPUTING, 2018, 74 (04) : 1522 - 1533
  • [50] OpenCL-Darknet: implementation and optimization of OpenCL-based deep learning object detection framework
    Yongbon Koo
    Sunghoon Kim
    Young-guk Ha
    World Wide Web, 2021, 24 : 1299 - 1319