Efficient Software-Implemented HW Fault Tolerance for TinyML Inference in Safety-critical Applications

Cited by: 3

Authors:

Sharif, Uzair [1]

Mueller-Gritschneder, Daniel [1]

Stahl, Rafael [1]

Schlichtmann, Ulf [1]

Affiliations:

[1] Technical University of Munich (TUM), Chair of Electronic Design Automation, Munich, Germany

Source:

2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2023

Keywords:

TinyML; safety; error detection; soft-error

DOI:

10.23919/DATE56975.2023.10137207

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

TinyML research has mainly focused on optimizing neural network inference in terms of latency, code size and energy use for efficient execution on low-power micro-controller units (MCUs). However, distinctive design challenges emerge in safety-critical applications, for example in small unmanned autonomous vehicles such as drones, due to the susceptibility of off-the-shelf MCU devices to soft errors. We propose three new techniques to protect TinyML inference against random soft errors with the goal of reducing run-time overhead: one for protecting fully-connected layers; one adaptation of existing algorithmic fault tolerance techniques to depth-wise convolutions; and an efficient technique to protect the so-called epilogues within TinyML layers. Integrating these layer-wise methods, we derive a full-inference hardening solution for TinyML that achieves run-time-efficient soft-error resilience. We evaluate our proposed solution on MLPerf-Tiny benchmarks. Our experimental results show that competitive resilience can be achieved compared with currently available methods, while reducing run-time overheads by ~120% for one fully-connected neural network (NN); ~20% for the two CNNs with depth-wise convolutions; and ~2% for a standard CNN. Additionally, we propose selective hardening, which reduces the incurred run-time overhead further by ~2x for the studied CNNs by focusing exclusively on avoiding mispredictions.
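
The abstract mentions existing algorithmic fault tolerance techniques without describing them. As a rough illustration only, the sketch below shows the classic checksum idea applied to an integer fully-connected layer: a column-sum row of the weights is computed offline, and at run time the sum of the layer outputs is compared against the product of that checksum row with the input. This is not the authors' method; the function names, the toy weights, and the way the soft error is emulated (a single bit flip in one accumulator) are all assumptions made for the example.

```python
def fc_layer(weights, x):
    """Integer fully-connected layer: y[i] = sum_j weights[i][j] * x[j]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def column_checksum(weights):
    """Offline step (hypothetical helper): sum each weight column."""
    return [sum(col) for col in zip(*weights)]


def checksum_ok(y, checksum_row, x):
    """Run-time check: the sum of the outputs must equal checksum_row . x."""
    expected = sum(c * v for c, v in zip(checksum_row, x))
    return sum(y) == expected


# Toy integer weights and input, chosen only for illustration.
weights = [[1, -2, 3], [4, 0, -1], [-3, 2, 2]]
x = [5, -1, 2]
checksum_row = column_checksum(weights)

y = fc_layer(weights, x)
clean_ok = checksum_ok(y, checksum_row, x)

# Emulate a soft error: flip bit 3 of one accumulator after it is computed.
y_faulty = list(y)
y_faulty[1] ^= 1 << 3
faulty_ok = checksum_ok(y_faulty, checksum_row, x)

print(y, clean_ok, faulty_ok)
```

With integer arithmetic the comparison is exact, so any single corrupted output changes the output sum and is detected. The check costs one extra dot product per layer, which is the kind of run-time overhead the abstract aims to reduce.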

Pages: 6
