A High-Performance and Energy-Efficient Photonic Architecture for Multi-DNN Acceleration

Cited by: 1
Authors
Li, Yuan [1 ]
Louri, Ahmed [1 ]
Karanth, Avinash [2 ]
Affiliations
[1] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
[2] Ohio Univ, Sch Elect Engn & Comp Sci, Athens, OH 45701 USA
Funding
U.S. National Science Foundation;
Keywords
Accelerator; dataflow; deep neural network; silicon photonics;
DOI
10.1109/TPDS.2023.3327535
CLC Number
TP301 [Theory and Methods];
Subject Classification Code
081202;
Abstract
Large-scale deep neural network (DNN) accelerators are poised to facilitate the concurrent processing of diverse DNNs, imposing demanding challenges on the interconnection fabric. These challenges include overcoming the performance degradation and increased energy consumption associated with system scaling, while also requiring the flexibility to support dynamic partitioning and adaptable organization of compute resources. Conventional metallic interconnects, however, face inherent limitations in scalability and flexibility. In this paper, we leverage silicon photonic interconnects and adopt an algorithm-architecture co-design approach to develop MDA, a DNN accelerator designed to enable high-performance and energy-efficient concurrent processing of diverse DNNs. Specifically, MDA consists of three novel components: 1) a resource allocation algorithm that assigns compute resources to concurrent DNNs based on their computational demands and priorities; 2) a dataflow selection algorithm that determines off-chip and on-chip dataflows for each DNN, with the objectives of minimizing off-chip and on-chip memory accesses, respectively; and 3) a flexible silicon photonic network that can be dynamically segmented into sub-networks, each interconnecting the compute resources assigned to a given DNN while adapting to the communication patterns dictated by the selected on-chip dataflow. Simulation results show that the proposed MDA accelerator outperforms other state-of-the-art multi-DNN accelerators, including PREMA, AI-MT, Planaria, and HDA. MDA achieves a speedup of 3.6x, accompanied by substantial improvements of 7.3x, 12.7x, and 9.2x in energy efficiency, service-level agreement (SLA) satisfaction rate, and fairness, respectively.
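The abstract describes the resource-allocation step only at a high level (compute resources are assigned to concurrent DNNs according to their computational demands and priorities). The Python sketch below is a minimal illustration of that idea under stated assumptions: the DNNRequest and allocate_tiles names, the MAC-count demand metric, and the demand-times-priority proportional rule are hypothetical choices for illustration, not MDA's actual algorithm.

```python
# Illustrative sketch only: the abstract does not specify MDA's allocation rule,
# so this assumes a proportional split by (compute demand x priority),
# with every DNN guaranteed at least one compute tile.
from dataclasses import dataclass

@dataclass
class DNNRequest:
    name: str
    macs: float      # estimated compute demand (e.g., total MAC operations)
    priority: int    # larger value = more urgent (hypothetical scale)

def allocate_tiles(requests: list[DNNRequest], total_tiles: int) -> dict[str, int]:
    """Split a pool of compute tiles across concurrent DNNs."""
    weights = {r.name: r.macs * r.priority for r in requests}
    total_weight = sum(weights.values())
    # Give each DNN at least one tile, then distribute the rest proportionally.
    alloc = {name: 1 for name in weights}
    remaining = total_tiles - len(alloc)
    for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        share = min(round(remaining * w / total_weight), remaining)
        alloc[name] += share
        remaining -= share
    # Hand any leftover tiles to the highest-weight DNN.
    if remaining > 0:
        alloc[max(weights, key=weights.get)] += remaining
    return alloc

if __name__ == "__main__":
    reqs = [DNNRequest("resnet50", 4.1e9, 2),
            DNNRequest("bert-base", 22.5e9, 3),
            DNNRequest("mobilenet", 0.6e9, 1)]
    print(allocate_tiles(reqs, total_tiles=64))
```

In this toy run, the high-priority, compute-heavy model receives most of the 64 tiles while the lightweight model keeps its guaranteed minimum; the paper's dataflow selection and photonic sub-network reconfiguration would then operate on each DNN's assigned partition.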
Pages: 46-58
Page count: 13
References (55 in total)
[1] Anonymous. Proceedings of the 25th Edition on Great Lakes Symposium on VLSI, 2015. DOI: 10.1145/2742060.2743766.
[2] Atabaki A. H. Opt. Express, 2013, 21: 15706.
[3] Baek Eunjin, Kwon Dongup, Kim Jangwoo. A Multi-Neural Network Acceleration Architecture. 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020: 940-953.
[4] Bergman K. Photonic Network-on-Chip Design, 2014.
[5] Chakradhar S. CONF PROC INT SYMP C, 2010: 247. DOI: 10.1145/1816038.1815993.
[6] Chen Tianshi, Du Zidong, Sun Ninghui, Wang Jia, Wu Chengyong, Chen Yunji, Temam Olivier. DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning. ACM SIGPLAN Notices, 2014, 49(4): 269-283.
[7] Chen Yu-Hsin, Emer Joel, Sze Vivienne. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks. 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016: 367-379.
[8] Choi Yujeong, Rhu Minsoo. PREMA: A Predictive Multi-task Scheduling Algorithm for Preemptible Neural Processing Units. 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2020: 220-233.
[9] de Lima Thomas Ferreira, Peng Hsuan-Tung, Tait Alexander N., Nahmias Mitchell A., Miller Heidi B., Shastri Bhavin J., Prucnal Paul R. Machine Learning With Neuromorphic Photonics. Journal of Lightwave Technology, 2019, 37(5): 1515-1534.
[10] DeCusatis C. 2013 Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2013.