Quality-driven design of deep neural network hardware accelerators for low power CPS and IoT applications

Cited by: 0
Authors
Jan, Yahya [1]
Jozwiak, Lech [1]
Affiliations
[1] Eindhoven Univ Technol, Fac Elect Engn, Eindhoven, Netherlands
Keywords
Deep Neural Networks (DNN); Cyber-Physical System (CPS); Internet of Things (IoT); Highly-parallel DNN architectures; Design Space Exploration (DSE); Low power design techniques; GENERATION;
DOI
10.1016/j.micpro.2024.105119
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents the results of our analysis of the main problems that have to be solved in the design of highly parallel, high-performance accelerators for Deep Neural Networks (DNNs) used in low-power Cyber-Physical System (CPS) and Internet of Things (IoT) devices, in application areas such as smart automotive, health, and smart services in social networks (Facebook, Instagram, X/Twitter, etc.). Our analysis demonstrates that, to arrive at a high-quality DNN accelerator architecture, complex mutual trade-offs have to be resolved among the accelerator micro- and macro-architecture and the corresponding memory and communication architectures, as well as among performance, power consumption, and area. Therefore, we developed a multi-processor accelerator design methodology involving an automatic design-space exploration (DSE) framework that enables very efficient construction and analysis of DNN accelerator architectures, as well as adequate trade-off exploitation. To satisfy the low-power demands of IoT devices, we extend our quality-driven, model-based multi-processor accelerator design methodology with novel power optimization techniques at the processor and memory exploration stages. Our proposed power optimization techniques at the processor exploration stage achieve up to a 66.5% reduction in power consumption, while our proposed data reuse techniques avoid up to 85.92% of redundant memory accesses, thereby reducing the accelerator's power consumption as required for low-power IoT applications. Currently, we are beginning to apply this methodology, with the proposed power optimization techniques, to the design of low-power DNN accelerators for IoT applications.
Pages: 13