MapZero: Mapping for Coarse-grained Reconfigurable Architectures with Reinforcement Learning and Monte-Carlo Tree Search

Cited by: 15
Authors
Kong, Xiangyu [1]
Huang, Yi [1]
Zhu, Jianfeng [1]
Man, Xingchen [1]
Liu, Yang [2]
Feng, Chunyang [2]
Gou, Pengfei [3]
Tang, Minggui [3]
Wei, Shaojun [1]
Liu, Leibo [1]
Affiliations
[1] Tsinghua Univ, BNRist, Sch Integrated Circuits, Beijing, Peoples R China
[2] GBA, Innovat Inst High Performance Server, Guangzhou, Guangdong, Peoples R China
[3] HEXIN Technol Co Ltd, Guangzhou, Guangdong, Peoples R China
Source
PROCEEDINGS OF THE 2023 THE 50TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
Coarse-Grained Reconfigurable Architecture; Compiler; Graph Neural Network; Reinforcement Learning; DATA-FLOW GRAPH; CGRA; ALGORITHM; FRAMEWORK; SHOGI; CHESS; GO;
DOI
10.1145/3579371.3589081
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Coarse-grained reconfigurable architecture (CGRA) has become a promising candidate for data-intensive computing due to its flexibility and high energy efficiency. CGRA compilers map data flow graphs (DFGs) extracted from applications onto CGRAs, playing a fundamental role in fully exploiting hardware resources for acceleration. Yet the existing compilers are time-demanding and cannot guarantee optimal results due to the traversal search of enormous search spaces brought about by the spatio-temporal flexibility of CGRA structures and the complexity of DFGs. Inspired by the amazing progress in reinforcement learning (RL) and Monte-Carlo tree search (MCTS) for real-world problems, we consider constructing a compiler that can learn from past experiences and comprehensively understand the target DFG and CGRA. In this paper, we propose an architecture-aware compiler for CGRAs based on RL and MCTS, called MapZero - a framework to automatically extract the characteristics of DFG and CGRA hardware and map operations onto varied CGRA fabrics. We apply Graph Attention Network to generate an adaptive embedding for DFGs and also model the functionality and interconnection status of the CGRA, aiming at training an RL agent to perform placement and routing intelligently. Experimental results show that MapZero can generate superior-quality mappings and reduce compilation time hundreds of times compared to state-of-the-art methods. MapZero can find high-quality mappings very quickly when the feasible solution space is rather small and all other compilers fail. We also demonstrate the scalability and broad applicability of our framework.
Pages: 646 / 659
Number of pages: 14