An automatic mapping technique for OpenACC kernel code based on deeply fused and heterogeneous many-core architecture

Cited by: 0
Authors
Zhang, Libo [1]
Mao, Xingquan [1]
You, Hongtao [1]
Gu, Long [1]
Jiang, Xiaocheng [1]
Affiliations
[1] Wuxi Jiangnan Inst Comp Technol, Wuxi, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Supercomputer; Heterogeneous; Many-core; Fused; OpenACC; Data layout; Automatic mapping
DOI
10.1007/s42514-020-00050-9
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
OpenACC has become a popular programming interface for many-core application programming. Internationally, much research has been done on OpenACC for CPU + GPU heterogeneous many-core architectures; among the resulting compilers, the PGI OpenACC compiler developed by NVIDIA is the most advanced. However, there has been little research on OpenACC for the Home Grown Heterogeneous Many-Core (HGHM) architecture, which differs from the GPU. This paper proposes an automatic mapping technique, built on the OpenACC compiler, that maps OpenACC kernel code to a heterogeneous and deeply fused many-core architecture. The approach uses the compiler's static analysis and feedback-driven dynamic analysis to map a program's parallel kernel code to many-core devices automatically, and it greatly improves the quality of the compiler's transformations. Experimental results show that this technique greatly improves the efficiency of using OpenACC to port applications to heterogeneous, fused many-core systems without degrading program acceleration performance.
Pages: 323-331
Number of pages: 9
Related papers
21 items in total
  • [1] An automatic mapping technique for OpenACC kernel code based on deeply fused and heterogeneous many-core architecture
    Libo Zhang
    Xingquan Mao
    Hongtao You
    Long Gu
    Xiaocheng Jiang
    CCF Transactions on High Performance Computing, 2020, 2 : 323 - 331
  • [2] Study on the Mapping of Streaming Application on Many-Core Architecture
    Yu, Lei
    Liu, Zhiyong
    Fan, Dongrui
    Ma, Yike
    Song, Fenglong
    Ye, Xiaochun
    Xu, Weizhi
    INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS II, PTS 1-3, 2011, 58-60 : 298 - 303
  • [3] Mapping Routing Lookup Algorithm on Many-Core Architecture based on SPM and Cache Mixed Method
    Yu, Lei
    Liu, Zhiyong
    Fan, Dongrui
    Ma, Yike
    Song, Fenglong
    Ye, Xiaochun
    Xu, Weizhi
    INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS II, PTS 1-3, 2011, 58-60 : 1226 - 1231
  • [4] Towards optimal scheduling policy for heterogeneous memory architecture in many-core system
    Park, Geunchul
    Rho, Seungwoo
    Kim, Jik-Soo
    Nam, Dukyun
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (01): : 121 - 133
  • [5] Towards optimal scheduling policy for heterogeneous memory architecture in many-core system
    Geunchul Park
    Seungwoo Rho
    Jik-Soo Kim
    Dukyun Nam
    Cluster Computing, 2019, 22 : 121 - 133
  • [6] Coarray-based load balancing on heterogeneous and many-core architectures
    Cardellini, Valeria
    Fanfarillo, Alessandro
    Filippone, Salvatore
    PARALLEL COMPUTING, 2017, 68 : 45 - 58
  • [7] DAG Scheduling Algorithm for a Cluster-Based Many-Core Architecture
    Kitagawa, Yuto
    Ishigooka, Tasuku
    Azumi, Takuya
    2018 IEEE 16TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2018), 2018, : 150 - 157
  • [8] Distributed SDN Architecture for NoC-based Many-core SoCs
    Ruaro, Marcelo
    Velloso, Nedison
    Jantsch, Axel
    Moraes, Fernando G.
    PROCEEDINGS OF THE 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS'19), 2019,
  • [9] NoC-based Many-Core Processor Using CUSPARC Architecture
    Soliman, Muhammad R.
    Fahmy, Hossam A. H.
    Habib, S. E. -D.
    2014 26TH INTERNATIONAL CONFERENCE ON MICROELECTRONICS (ICM), 2014, : 84 - 87
  • [10] BLOCK-BASED HARDWARE SCHEDULER DESIGN ON MANY-CORE ARCHITECTURE
    Ju, Lihan
    Pan, Ping
    Quan, Baixing
    Chen, Tianzhou
    Wu, Minghui
    2012 IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2012, : 814 - 819