Bridging Python to Silicon: The SODA Toolchain

Cited by: 12
Authors
Agostini, Nicolas Bohm [1]
Curzel, Serena [1]
Zhang, Jeff [4]
Limaye, Ankur [1]
Tan, Cheng [7]
Amatya, Vinay [2]
Minutoli, Marco [3]
Castellana, Vito Giovanni [1]
Manzano, Joseph [1]
Brooks, David [5]
Wei, Gu-Yeon [6]
Tumeo, Antonino [1]
Affiliations
[1] Pacific Northwest Natl Lab, High Performance Comp Grp, Richland, WA 99354 USA
[2] Pacific Northwest Natl Lab, Richland, WA 99354 USA
[3] Pacific Northwest Natl Lab, Data Sci & Machine Intelligence Grp, Richland, WA 99354 USA
[4] Harvard Univ, Architecture Circuits & Compilers Grp, Cambridge, MA 02138 USA
[5] Harvard Univ, Sch Engn & Appl Sci, Comp Sci, Cambridge, MA 02138 USA
[6] Harvard Univ, John A Paulson Sch Engineer & Appl Sci, Elect Engn & Comp Sci, Cambridge, MA 02138 USA
[7] Microsoft, Redmond, WA USA
Keywords
Hardware; Optimization; Synthesizers; Codes; Hardware design languages; Kernel; Field programmable gate arrays; Compiler Techniques; MLIR; High-Level Synthesis; Hardware generation; Silicon Compiler
DOI
10.1109/MM.2022.3178580
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Systems performing scientific computing, data analysis, and machine learning tasks have a growing demand for application-specific accelerators that can provide high computational performance while meeting strict size and power requirements. However, the algorithms and applications that need to be accelerated are evolving at a rate that is incompatible with manual design processes based on hardware description languages. Agile hardware design tools based on compiler techniques can help by quickly producing an application-specific integrated circuit (ASIC) accelerator starting from a high-level algorithmic description. We present the software-defined accelerator (SODA) synthesizer, a modular and open-source hardware compiler that provides automated end-to-end synthesis from high-level software frameworks to ASIC implementation, relying on multilevel representations to progressively lower and optimize the input code. Our approach does not require the application developer to write any register-transfer level code, and it is able to reach up to 364 giga floating point operations per second (GFLOPS)/W efficiency (32-bit precision) on typical convolutional neural network operators.
Pages: 78-88
Page count: 11
Related papers
15 in total
[1]   High-Level Synthesis of Parallel Specifications Coupling Static and Dynamic Controllers [J].
Castellana, Vito Giovanni ;
Tumeo, Antonino ;
Ferrandi, Fabrizio .
2021 IEEE 35TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2021, :192-202
[2]  
CIRCT Developers, 2020, CIRCT CIRCUIT IRCOMP
[3]  
Esmaeilzadeh H., 2021, P ICCAD, P1
[4]   Invited: Bambu: an Open-Source Research Framework for the High-Level Synthesis of Complex Applications [J].
Ferrandi, Fabrizio ;
Castellana, Vito Giovanni ;
Curzel, Serena ;
Fezzardi, Pietro ;
Fiorito, Michele ;
Lattuada, Marco ;
Minutoli, Marco ;
Pilato, Christian ;
Tumeo, Antonino .
2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, :1327-1330
[5]   Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration [J].
Genc, Hasan ;
Kim, Seah ;
Amid, Alon ;
Haj-Ali, Ameer ;
Iyer, Vighnesh ;
Prakash, Pranav ;
Zhao, Jerry ;
Grubb, Daniel ;
Liew, Harrison ;
Mao, Howard ;
Ou, Albert ;
Schmidt, Colin ;
Steffl, Samuel ;
Wright, John ;
Stoica, Ion ;
Ragan-Kelley, Jonathan ;
Asanovic, Krste ;
Nikolic, Borivoje ;
Shao, Yakun Sophia .
2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, :769-774
[6]  
github, POLYGEIST SCRIPT
[7]  
gitlab, FLOPOCO
[8]   PyLog: An Algorithm-Centric Python-Based FPGA Programming and Synthesis Flow [J].
Huang, Sitao ;
Wu, Kun ;
Jeong, Hyunmin ;
Wang, Chengyue ;
Chen, Deming ;
Hwu, Wen-mei .
IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (12) :2015-2028
[9]   HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing [J].
Lai, Yi-Hsiang ;
Chi, Yuze ;
Hu, Yuwei ;
Wang, Jie ;
Yu, Cody Hao ;
Zhou, Yuan ;
Cong, Jason ;
Zhang, Zhiru .
PROCEEDINGS OF THE 2019 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS (FPGA'19), 2019, :242-251
[10]   MLIR: Scaling Compiler Infrastructure for Domain Specific Computation [J].
Lattner, Chris ;
Amini, Mehdi ;
Bondhugula, Uday ;
Cohen, Albert ;
Davis, Andy ;
Pienaar, Jacques ;
Riddle, River ;
Shpeisman, Tatiana ;
Vasilache, Nicolas ;
Zinenko, Oleksandr .
CGO '21: PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2021, :2-14