RRAMedy: Protecting ReRAM-based Neural Network from Permanent and Soft Faults During Its Lifetime

Cited by: 33
Authors
Li, Wen [1, 2]
Wang, Ying [1, 2]
Li, Huawei [1, 2, 3]
Li, Xiaowei [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, SKLCA, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
Source
2019 IEEE 37TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
LOW-POWER;
DOI
10.1109/ICCD46524.2019.00020
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The emerging memristor technology is considered a promising solution for edge-oriented deep learning and neuromorphic processor chips because it enables power-efficient Computing-in-Memory (CiM) and a normally-off architecture simultaneously. However, owing to their analog nature and the immature nano-scale fabrication technology, memristive cells suffer from manufacturing defects, process variations, and aging-induced variations, which may cause system and functional failures in applications. Detecting and recovering from permanent and soft faults therefore poses a significant challenge for edge ReRAM-based deep learning and neuromorphic chips. In this work, we propose an edge-cloud collaborative framework, RRAMedy, to achieve in-situ fault detection and network remedy for memristor-based neural accelerators. Within this framework, we present Adversarial Example Testing, a lifetime on-device fault detection technique that can accurately detect defective cells and memristor soft faults with high probability and at low cost. Furthermore, model accuracy can be restored by the proposed edge-cloud collaborative fault-masking retraining and model updating mechanism with minimal edge-cloud communication overhead. The experimental results show that RRAMedy can effectively detect memristor permanent and soft faults, protecting the neural accelerator from accuracy and performance degradation over its life cycle.
Pages: 91-99
Number of pages: 9