On-Chip Deep Neural Network Storage with Multi-Level eNVM

Cited by: 19
Authors
Donato, Marco [1]
Reagen, Brandon [1]
Pentecost, Lillian [1]
Gupta, Udit [1]
Brooks, David [1]
Wei, Gu-Yeon [1]
Affiliation
[1] Harvard Univ, Cambridge, MA 02138 USA
Source
2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC) | 2018
DOI
10.1145/3195970.3196083
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405
Abstract
One of the biggest performance bottlenecks of today's neural network (NN) accelerators is off-chip memory accesses [11]. In this paper, we propose a method to use multi-level, embedded nonvolatile memory (eNVM) to eliminate all off-chip weight accesses. The use of multi-level memory cells increases the probability of faults. Therefore, we co-design the weights and memories such that their properties complement each other and the faults result in no noticeable NN accuracy loss. In the extreme case, the weights in fully connected layers can be stored using a single transistor. With weight pruning and clustering, we show our technique reduces the memory area by over an order of magnitude compared to an SRAM baseline. In the case of VGG16 (130M weights), we are able to store all the weights in 4.9 mm², well within the area allocated to SRAM in modern NN accelerators [6].
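The abstract names weight pruning and clustering as the steps that shrink the stored model, but it does not give the procedure. The following is only a minimal sketch of the general idea, not the authors' method: magnitude pruning and a plain 1-D k-means are stand-ins chosen for illustration, and the function names, the 50% sparsity, the 16 clusters, and the bits-per-cell values are all hypothetical. It shows how clustering turns each surviving weight into a small index, and how many multi-level cells such an index would occupy under the assumption that the index bits are simply packed into cells.

```python
import math
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights (assumed pruning rule)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def cluster_1d(values, n_clusters, n_iters=20):
    """Plain 1-D k-means (Lloyd's algorithm) with evenly spaced initial centroids."""
    centroids = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iters):
        assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = values[assign == c]
            if members.size:  # leave empty clusters where they are
                centroids[c] = members.mean()
    assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, assign

def cells_per_index(n_clusters, bits_per_cell):
    """Cells needed to hold one cluster index, assuming index bits are packed into cells."""
    index_bits = max(1, math.ceil(math.log2(n_clusters)))
    return math.ceil(index_bits / bits_per_cell)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=4096)          # synthetic stand-in for one layer's weights
pruned = prune_by_magnitude(w, sparsity=0.5)  # hypothetical 50% sparsity
nonzero = pruned[pruned != 0.0]
centroids, idx = cluster_1d(nonzero, n_clusters=16)
quantized = centroids[idx]                    # each weight replaced by its cluster centroid

print("index bits per weight:", math.ceil(math.log2(16)))
print("cells per weight at 2 bits/cell:", cells_per_index(16, 2))
print("cells per weight at 4 bits/cell:", cells_per_index(16, 4))
```

Under these assumptions, a 16-entry codebook needs a 4-bit index, so a cell holding 4 bits would store one weight index per cell, which is in the spirit of the abstract's "single transistor" extreme case. The fault-tolerance co-design the abstract describes is not modeled here.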
Pages: 6
Related papers
50 in total
  • [41] Multi-level fusion with deep neural networks for multimodal sentiment classification
    Zhang Guangwei
    Zhao Bing
    Li Ruifan
    The Journal of China Universities of Posts and Telecommunications, 2022, 29 (03) : 25 - 33
  • [42] Robust multi-level current-mode on-chip interconnect signaling in the presence of process variations
    Venkatraman, V
    Burleson, W
    6TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, PROCEEDINGS, 2005, : 522 - 527
  • [43] Interconnect Area, Delay and Area-Delay Optimization for Multi-level Signaling On-Chip Bus
    Ching, Mai Y.
    Boon, Ang T.
    Yeong, Chin K.
    Rokhani, Fakhrul Z.
    PROCEEDINGS OF THE 2010 IEEE ASIA PACIFIC CONFERENCE ON CIRCUIT AND SYSTEM (APCCAS), 2010, : 1143 - 1146
  • [44] A Dual-Mode Weight Storage Analog Neural Network Platform for On-Chip Applications
    Maliuk, Dzmitry
    Makris, Yiorgos
    2012 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 2012), 2012,
  • [45] Neural Network based On-Chip Thermal Simulator
    Kumar, Pratyush
    Atienza, David
    2010 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, 2010, : 1599 - 1602
  • [46] Local cluster neural network on-chip training
    Zhang, Liang
    Sitte, Joaquin
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 29 - +
  • [47] On-Chip Memory Technology Design Space Explorations for Mobile Deep Neural Network Accelerators
    Li, Haitong
    Bhargava, Mudit
    Whatmough, Paul N.
    Wong, H-S Philip
    PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019,
  • [48] Hierarchical Deep Network for Group Discovery and Multi-level Activity Recognition
    Goyal, Ashish
    Bhargava, Neha
    Chaudhuri, Subhasis
    Velmurugan, Rajbabu
    ELEVENTH INDIAN CONFERENCE ON COMPUTER VISION, GRAPHICS AND IMAGE PROCESSING (ICVGIP 2018), 2018,
  • [49] Salient Object Detection Based on Deep Multi-level Cascade Network
    Sun, Dengdi
    Wu, Hang
    Ding, Zhuanlian
    Li, Sheng
    Luo, Bin
    ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, 2020, 11691 : 86 - 95
  • [50] A Multi-level Artificial Neural Network for Gasoline Demand Forecasting of Iran
    Kazemi, A.
    Shakouri G, H.
    Mehregan, M. R.
    Taghizadeh, M. R.
    Menhaj, M. B.
    Foroughi.A, A.
    SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 61 - +