A CAD-based methodology to optimize HLS code via the Roofline model

Cited by: 15
Authors
Siracusa, Marco [1]
Rabozzi, Marco [2]
Del Sozzo, Emanuele [1]
Di Tucci, Lorenzo [2]
Williams, Samuel [3]
Santambrogio, Marco D. [1]
Affiliations
[1] Politecnico di Milano, Milan, Italy
[2] Huxelerate srl, Milan, Italy
[3] Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Source
2020 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) | 2020
Keywords
Roofline Model; FPGA; High-Performance Computing; CAD; DSE; Design Space Exploration; Performance Model; Efficient
DOI
10.1145/3400302.3415730
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The intrinsic complexity of modern computing systems requires structured methods for analyzing and optimizing application performance. In this context, the Roofline model offers an intuitive, visual method that provides performance insight and optimization guidance for a given architecture. Although this methodology successfully models multicore and GPU performance optimizations, the original formulation does not directly apply to FPGA devices. For this reason, we propose a Roofline model analysis for reconfigurable architectures and an associated CAD tool for assisting HLS optimization of C/C++ applications. We first model FPGA attainable performance by means of an analytical method. Then, we integrate locality walls and a DSE engine for an enhanced optimization process. Starting from a software version of the N-body algorithm, we first illustrate how our methodology helps quickly achieve performance comparable to a state-of-the-art bespoke FPGA implementation. Then, we illustrate an assisted platform porting of the Smith-Waterman sequence alignment that provides a 9x speedup. Finally, we evaluate the DSE engine alone on the PolyBench test suite and achieve performance improvements of up to 14.36x compared to previous automated solutions in the literature.
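As background for the abstract, the bound at the heart of the classic Roofline model can be sketched in a few lines. This is the original multicore formulation (attainable performance is the minimum of the compute roof and the memory roof), not the FPGA-specific analytical model the paper proposes, which the abstract does not detail. The function names and the platform numbers below are hypothetical and chosen only for illustration.

```python
def attainable_performance(peak_gflops, bandwidth_gbs, operational_intensity):
    """Classic Roofline bound in GFLOP/s.

    peak_gflops:            compute roof of the platform (GFLOP/s)
    bandwidth_gbs:          memory bandwidth roof (GB/s)
    operational_intensity:  FLOPs performed per byte moved from memory
    """
    if operational_intensity <= 0:
        raise ValueError("operational intensity must be positive")
    # A kernel is limited either by compute or by how fast data arrives.
    return min(peak_gflops, bandwidth_gbs * operational_intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Operational intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


def is_memory_bound(peak_gflops, bandwidth_gbs, operational_intensity):
    """True when the kernel sits to the left of the ridge point."""
    return operational_intensity < ridge_point(peak_gflops, bandwidth_gbs)


if __name__ == "__main__":
    # Hypothetical platform: 100 GFLOP/s peak, 20 GB/s memory bandwidth.
    peak, bandwidth = 100.0, 20.0
    for oi in (0.5, 8.0):
        bound = attainable_performance(peak, bandwidth, oi)
        regime = "memory" if is_memory_bound(peak, bandwidth, oi) else "compute"
        print(f"OI={oi} FLOP/byte: bound={bound} GFLOP/s ({regime}-bound)")
```

Under these assumed numbers, a kernel with operational intensity below the ridge point of 5 FLOP/byte is limited by memory bandwidth, which is the kind of insight the model uses to steer optimization toward either data movement or compute parallelism.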
Pages: 9