Design space exploration of neural network accelerator based on transfer learning

Cited by: 0
Authors
Wu Y. [1]
Zhi T. [2]
Song X. [2]
Li X. [1]
Affiliations
[1] School of Computer Science, University of Science and Technology of China, Hefei
[2] State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
Funding
National Natural Science Foundation of China
Keywords
design space exploration (DSE); multi-task learning; neural network accelerator; transfer learning;
DOI
10.3772/j.issn.1006-6748.2023.04.009
Abstract
With the increasing demand for computational power in artificial intelligence (AI) algorithms, dedicated accelerators have become a necessity. However, the complexity of hardware architectures, the vast design search space, and the diverse tasks these accelerators must serve pose significant challenges. Traditional search methods become prohibitively slow as the search space grows. A design space exploration (DSE) method based on transfer learning is proposed; it reduces the time spent on repeated training and uses multi-task models to handle different tasks on the same processor. The proposed method accurately predicts the latency and energy consumption associated with neural network accelerator design parameters, enabling optimal designs to be identified faster than with traditional methods, and it requires less training time than other DSE methods based on a multilayer perceptron (MLP). Comparative experiments demonstrate that the proposed method improves the efficiency of DSE without compromising the accuracy of the results. © 2023 Inst. of Scientific and Technical Information of China. All rights reserved.
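As a rough illustration of the approach described in the abstract, the sketch below shows a multi-task surrogate model with a shared trunk and separate prediction heads for latency and energy, where transfer to a new workload is approximated by freezing the shared trunk and fine-tuning only the heads. This is a minimal sketch assuming a PyTorch implementation; the names (MultiTaskSurrogate, fine_tune), the eight hypothetical design parameters, and the synthetic targets are illustrative assumptions, not the authors' code.

# Minimal sketch (not the authors' implementation) of a multi-task MLP
# surrogate for accelerator DSE: a shared trunk encodes design parameters
# (e.g. PE-array size, buffer sizes, bandwidth), and two heads predict
# latency and energy. Transfer learning is approximated by freezing the
# trunk and fine-tuning only the heads on a new workload.
import torch
import torch.nn as nn

class MultiTaskSurrogate(nn.Module):
    def __init__(self, n_params: int, hidden: int = 64):
        super().__init__()
        # Shared trunk: reused across workloads (transferred knowledge).
        self.trunk = nn.Sequential(
            nn.Linear(n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Task-specific heads: latency and energy predictors.
        self.latency_head = nn.Linear(hidden, 1)
        self.energy_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.trunk(x)
        return self.latency_head(h), self.energy_head(h)

def fine_tune(model, x, y_lat, y_eng, freeze_trunk=True, epochs=200):
    """Fine-tune on a new workload; freezing the trunk mimics transfer."""
    if freeze_trunk:
        for p in model.trunk.parameters():
            p.requires_grad = False
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        lat, eng = model(x)
        loss = loss_fn(lat, y_lat) + loss_fn(eng, y_eng)
        loss.backward()
        opt.step()
    return model

if __name__ == "__main__":
    # Synthetic example: 8 design parameters, 128 sampled design points.
    torch.manual_seed(0)
    x = torch.rand(128, 8)
    y_lat = x.sum(dim=1, keepdim=True)          # placeholder latency targets
    y_eng = (x ** 2).sum(dim=1, keepdim=True)   # placeholder energy targets
    model = MultiTaskSurrogate(n_params=8)
    fine_tune(model, x, y_lat, y_eng)
    # Rank candidate designs by predicted latency (lower is better).
    with torch.no_grad():
        lat_pred, _ = model(x)
    best = torch.argsort(lat_pred.squeeze())[:5]
    print("Top-5 candidate design indices:", best.tolist())

In a full DSE loop, such a surrogate would be queried inside a search strategy over the accelerator parameter space, with only the lightweight heads retrained when moving to a new network workload, which is where the reported training-time savings would come from.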
Pages: 416-426
Number of pages: 10