Design Space Exploration for Scalable DNN Accelerators Using a Memory-Centric Analytical Model for HW/SW Co-Design

Cited by: 0
Authors
Huang, Wei-Chun [1 ]
Tang, Chih-Wei [1 ]
Chang, Kuei-Chung [2 ]
Chen, Tien-Fu [1 ]
Hsieh, Hsiang-Cheng [1 ]
Tsai, Ming-Hsuan [1 ]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Hsinchu, Taiwan
[2] Natl Yunlin Univ Sci & Technol, Int Grad Sch Artificial Intelligence, Touliu, Taiwan
Keywords
DNN Accelerator; Hardware/Software Co-design; Design Space Exploration; Traffic Generator; Model Mapping; Traffic Optimization; HARDWARE
DOI
10.1145/3729227
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As Deep Neural Network (DNN) models become more complex, their escalating computational demands on hardware have made DNN accelerators a critical research topic, and accelerators must keep pace with the rapid growth of these models. However, hardware design is costly, and hardware and software are tightly coupled in the design of DNN accelerators. HW/SW co-design has therefore been studied extensively, which highlights the importance of a comprehensive framework that helps find the optimal hardware and software design during the design phase. The cost models used in most current research rely on data reuse and mathematical estimation to calculate costs, an approach that is fast but inaccurate. In this article, we propose a framework for HW/SW co-design and introduce a hybrid cost model based on Gem5 that provides fast and precise performance evaluation. The framework uses a memory-centric approach to accurately model off-chip memory behavior. In addition, we discuss how to find the best design in a large co-design space and how to integrate a design point through a traffic generator and a cost model. Finally, we demonstrate that our framework can accurately assist DNN accelerator developers in exploring the optimal hardware and software co-design quickly and efficiently.
Pages: 29