Algorithm/Architecture Co-Design for Energy-Efficient Acceleration of Multi-Task DNN

被引:0
作者
Shin, Jaekang [1 ]
Choi, Seungkyu [1 ]
Ra, Jongwoo [1 ]
Kim, Lee -Sup [1 ]
机构
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
来源
PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022 | 2022年
基金
新加坡国家研究基金会;
关键词
D O I
10.1145/3489517.3530455
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Real-world AI applications, such as augmented reality or autonomous driving, require processing multiple CV tasks simultaneously. However, the enormous data size and the memory footprint have been a crucial hurdle for deep neural networks to be applied in resourceconstrained devices. To solve the problem, we propose an algorithm/architecture co-design. The proposed algorithmic scheme, named SqueeD, reduces per-task weight and activation size by 21.9x and 2.1x, respectively, by sharing those data between tasks. Moreover, we design architecture and dataflow to minimize DRAM access by fully utilizing benefits from SqueeD. As a result, the proposed architecture reduces the DRAM access increment and energy consumption increment per task by 2.2x and 1.3x, respectively.
引用
收藏
页码:253 / 258
页数:6
相关论文
共 21 条
  • [1] [Anonymous], 2010, Caltech-ucsd birds 200
  • [2] CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories
    Balasubramonian, Rajeev
    Kahng, Andrew B.
    Muralimanohar, Naveen
    Shafiee, Ali
    Srinivas, Vaishnav
    [J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2017, 14 (02)
  • [3] Bengio Y., 2013, arXiv
  • [4] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
    Chen, Liang-Chieh
    Zhu, Yukun
    Papandreou, George
    Schroff, Florian
    Adam, Hartwig
    [J]. COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 : 833 - 851
  • [5] Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks
    Chen, Yu-Hsin
    Krishna, Tushar
    Emer, Joel S.
    Sze, Vivienne
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2017, 52 (01) : 127 - 138
  • [6] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
  • [7] Rich feature hierarchies for accurate object detection and semantic segmentation
    Girshick, Ross
    Donahue, Jeff
    Darrell, Trevor
    Malik, Jitendra
    [J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, : 580 - 587
  • [8] Guo DM, 2021, 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, P4884
  • [9] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778
  • [10] Horowitz M, 2014, ISSCC DIG TECH PAP I, V57, P10, DOI 10.1109/ISSCC.2014.6757323