MapZero: Mapping for Coarse-grained Reconfigurable Architectures with Reinforcement Learning and Monte-Carlo Tree Search

Cited by: 6
Authors
Kong, Xiangyu [1]
Huang, Yi [1]
Zhu, Jianfeng [1]
Man, Xingchen [1]
Liu, Yang [2]
Feng, Chunyang [2]
Gou, Pengfei [3]
Tang, Minggui [3]
Wei, Shaojun [1]
Liu, Leibo [1]
Affiliations
[1] Tsinghua Univ, BNRist, Sch Integrated Circuits, Beijing, Peoples R China
[2] GBA, Innovat Inst High Performance Server, Guangzhou, Guangdong, Peoples R China
[3] HEXIN Technol Co Ltd, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Coarse-Grained Reconfigurable Architecture; Compiler; Graph Neural Network; Reinforcement Learning; DATA-FLOW GRAPH; CGRA; ALGORITHM; FRAMEWORK; SHOGI; CHESS; GO;
DOI
10.1145/3579371.3589081
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Coarse-grained reconfigurable architecture (CGRA) has become a promising candidate for data-intensive computing due to its flexibility and high energy efficiency. CGRA compilers map data flow graphs (DFGs) extracted from applications onto CGRAs, playing a fundamental role in fully exploiting hardware resources for acceleration. Yet the existing compilers are time-demanding and cannot guarantee optimal results due to the traversal search of enormous search spaces brought about by the spatio-temporal flexibility of CGRA structures and the complexity of DFGs. Inspired by the amazing progress in reinforcement learning (RL) and Monte-Carlo tree search (MCTS) for real-world problems, we consider constructing a compiler that can learn from past experiences and comprehensively understand the target DFG and CGRA. In this paper, we propose an architecture-aware compiler for CGRAs based on RL and MCTS, called MapZero - a framework to automatically extract the characteristics of DFG and CGRA hardware and map operations onto varied CGRA fabrics. We apply Graph Attention Network to generate an adaptive embedding for DFGs and also model the functionality and interconnection status of the CGRA, aiming at training an RL agent to perform placement and routing intelligently. Experimental results show that MapZero can generate superior-quality mappings and reduce compilation time hundreds of times compared to state-of-the-art methods. MapZero can find high-quality mappings very quickly when the feasible solution space is rather small and all other compilers fail. We also demonstrate the scalability and broad applicability of our framework.
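The mapping problem described in the abstract can be illustrated with a toy sketch. The code below is not MapZero and uses no RL or MCTS: it is a plain exhaustive backtracking search, the kind of traversal search the abstract describes as time-demanding. It assumes a purely spatial mapping (no time dimension), a mesh of identical processing elements (PEs) with 4-neighbour links, and that every DFG edge must join directly adjacent PEs. The names `mesh_neighbors` and `map_dfg` are hypothetical and chosen only for this illustration.

```python
from itertools import product


def mesh_neighbors(rows, cols):
    """Return {pe: set of adjacent PEs} for a rows x cols mesh (assumed 4-neighbour links)."""
    nbrs = {}
    for r, c in product(range(rows), range(cols)):
        adj = set()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                adj.add((rr, cc))
        nbrs[(r, c)] = adj
    return nbrs


def map_dfg(nodes, edges, rows, cols):
    """Place each DFG node on a distinct PE so that every edge joins adjacent PEs.

    Returns a dict {node: (row, col)}, or None if no such placement exists.
    Simplifying assumptions: no routing through intermediate PEs, no scheduling.
    """
    nbrs = mesh_neighbors(rows, cols)
    placement = {}

    def legal(node, pe):
        # A PE holds at most one operation in this toy model.
        if pe in placement.values():
            return False
        # Every already-placed neighbour in the DFG must sit on an adjacent PE.
        for src, dst in edges:
            other = dst if src == node else src if dst == node else None
            if other is not None and other in placement:
                if placement[other] not in nbrs[pe]:
                    return False
        return True

    def search(i):
        if i == len(nodes):
            return True
        node = nodes[i]
        for pe in nbrs:
            if legal(node, pe):
                placement[node] = pe
                if search(i + 1):
                    return True
                del placement[node]  # undo and try the next PE
        return False

    return dict(placement) if search(0) else None


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "c"), ("b", "c"), ("c", "d")]
    print(map_dfg(nodes, edges, 2, 2))  # node c needs three adjacent PEs
    print(map_dfg(nodes, edges, 2, 3))
```

Even in this reduced setting the number of candidate placements grows factorially with the number of PEs, which gives a sense of why the abstract motivates a learned search policy in place of exhaustive traversal.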
Pages: 646-659
Number of pages: 14