An automatic mapping technique for OpenACC kernel code based on deeply fused and heterogeneous many-core architecture

Cited by: 0

Authors:

Zhang, Libo [1]

Mao, Xingquan [1]

You, Hongtao [1]

Gu, Long [1]

Jiang, Xiaocheng [1]

Affiliations:

[1] Wuxi Jiangnan Institute of Computing Technology, Wuxi, Jiangsu, People's Republic of China

Source:

CCF Transactions on High Performance Computing | 2020 / Volume 2 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

Supercomputer; Heterogeneous; Many-core; Fused; OpenACC; Data layout; Automatic mapping;

DOI:

10.1007/s42514-020-00050-9

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

OpenACC has become a popular programming interface for many-core application programming. Internationally, much research has been done on OpenACC for CPU + GPU heterogeneous many-core architectures, and among the resulting tools the PGI OpenACC compiler developed by NVIDIA is the most advanced. However, there has been little research on OpenACC for the Home Grown Heterogeneous Many-Core (HGHM) architecture, which differs from the GPU. This paper proposes an automatic mapping technique, built on the OpenACC compiler, that maps OpenACC kernel code to a heterogeneous and deeply fused many-core architecture. The approach uses the compiler's static analysis and feedback-driven dynamic analysis to map a program's parallel kernel code to many-core devices automatically, and it greatly improves the transformation quality of the compiler. Experimental results show that the technique greatly improves the efficiency of using OpenACC to port applications to heterogeneous and fused many-core systems without impacting program acceleration performance.


Pages: 323-331

Page count: 9
