Extending High-Level Synthesis with High-Performance Computing Performance Visualization

Cited by: 0
Authors
Huthmann, Jens [1]
Podobas, Artur [2]
Sommer, Lukas [3]
Koch, Andreas [3]
Sano, Kentaro [1]
Affiliations
[1] RIKEN, Ctr Computat Sci, Kobe, Hyogo, Japan
[2] KTH, Royal Inst Technol, Stockholm, Sweden
[3] Tech Univ Darmstadt, Embedded Syst & Applicat Grp, Darmstadt, Germany
Source
2020 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2020) | 2020
Keywords
Visualization; FPGA; HLS; High-Level Synthesis; High-Performance Computing; Performance Optimization;
DOI
10.1109/CLUSTER49012.2020.00047
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The recent maturity of High-Level Synthesis (HLS) has renewed interest in using Field-Programmable Gate Arrays (FPGAs) to accelerate High-Performance Computing (HPC) applications. Several studies have shown performance and power benefits of FPGAs over existing approaches for a number of application kernels, with ample room for improvement. Unfortunately, modern HLS tools offer little support for understanding why a given application behaves as it does on the FPGA, and most experts rely on intuition or abstract performance models. In this work, we hypothesize that existing profiling and visualization tools used in the HPC domain are also usable for understanding performance on FPGAs. We extend an existing HLS tool-chain to support Paraver, a state-of-the-art visualization and profiling tool well known in HPC. We describe how each of the events and states is collected, and empirically quantify the hardware overhead of collecting them. Finally, we apply our contribution to two different applications, demonstrating how the tool can provide unique insights into application execution and how it can guide optimizations.
Pages: 371-380
Number of pages: 10