Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications

Cited by: 320
Authors
Li, Guanpeng [1]
Hari, Siva Kumar Sastry [2]
Sullivan, Michael [2]
Tsai, Timothy [2]
Pattabiraman, Karthik [1]
Emer, Joel [2]
Keckler, Stephen W. [2]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] NVIDIA, Santa Clara, CA, USA
Source
SC'17: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS | 2017
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Deep Learning; Silent Data Corruption; Soft Error; Reliability
DOI
10.1145/3126908.3126964
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Deep learning neural networks (DNNs) have been successful in solving a wide range of machine learning problems. Specialized hardware accelerators have been proposed to accelerate the execution of DNN algorithms for high performance and energy efficiency. Recently, they have been deployed in datacenters (potentially for business-critical or industrial applications) and safety-critical systems such as self-driving cars. Soft errors caused by high-energy particles have been increasing in hardware systems, and these can lead to catastrophic failures in DNN systems. Traditional methods for building resilient systems, e.g., Triple Modular Redundancy (TMR), are agnostic to the DNN algorithm and the DNN accelerator's architecture. Hence, these traditional resilience approaches incur high overheads, which makes them challenging to deploy. In this paper, we experimentally evaluate the resilience characteristics of DNN systems (i.e., DNN software running on specialized accelerators). We find that the error resilience of a DNN system depends on the data types, values, data reuse, and types of layers in the design. Based on our observations, we propose two efficient protection techniques for DNN systems.
Pages: 12
Related papers
50 entries in total
  • [1] Soft Error Mitigation for Deep Convolution Neural Network on FPGA Accelerators
    Li, Wenshuo
    Ge, Guangjun
    Guo, Kaiyuan
    Chen, Xiaoming
    Wei, Qi
    Gao, Zhen
    Wang, Yu
    Yang, Huazhong
    2020 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2020), 2020, : 1 - 5
  • [2] Quantization-Error-Robust Deep Neural Network for Embedded Accelerators
    Jung, Youngbeom
    Kim, Hyeonuk
    Choi, Yeongjae
    Kim, Lee-Sup
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (02) : 609 - 613
  • [3] SEALing Neural Network Models in Encrypted Deep Learning Accelerators
    Zuo, Pengfei
    Hua, Yu
    Liang, Ling
    Xie, Xinfeng
    Hu, Xing
    Xie, Yuan
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1255 - 1260
  • [4] Kernel Mapping Techniques for Deep Learning Neural Network Accelerators
    Ozdemir, Sarp
    Khasawneh, Mohammad
    Rao, Smriti
    Madden, Patrick H.
    ISPD'22: PROCEEDINGS OF THE 2022 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, 2022, : 21 - 28
  • [5] Efficient On-Line Error Detection and Mitigation for Deep Neural Network Accelerators
    Schorn, Christoph
    Guntoro, Andre
    Ascheid, Gerd
    COMPUTER SAFETY, RELIABILITY, AND SECURITY (SAFECOMP 2018), 2018, 11093 : 205 - 219
  • [6] Optimizing deep learning inference on mobile devices with neural network accelerators
    曾惜
    Xu Yunlong
    Zhi Tian
    High Technology Letters, 2019, 25 (04) : 417 - 425
  • [7] Link Bit-Error-Rate Requirement Analysis for Deep Neural Network Accelerators
    Lee, Jaewon
    Kim, Gain
    Park, Jinho
    Bae, Hyeon-Min
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [8] DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications
    Whatmough, Paul N.
    Lee, Sae Kyu
    Brooks, David
    Wei, Gu-Yeon
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2018, 53 (09) : 2722 - 2731
  • [9] Understanding Error Propagation in GPGPU Applications
    Li, Guanpeng
    Pattabiraman, Karthik
    Cher, Chen-Yong
    Bose, Pradip
    SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2016, : 240 - 251
  • [10] FPA-DNN: A Forward Propagation Acceleration based Deep Neural Network for Ship Detection
    Wang, Feng
    Liao, Fanshu
    Zhu, Huiqing
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,