Enabling Overclocking through Algorithm-Level Error Detection

Cited by: 9
Authors
Marty, Thibaut [1]
Yuki, Tomofumi [1]
Derrien, Steven [1]
Affiliations
[1] Univ Rennes, INRIA, CNRS, IRISA, Rennes, France
Source
2018 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT 2018) | 2018
Keywords
FAULT-TOLERANCE;
DOI
10.1109/FPT.2018.00034
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this paper, we propose a technique for improving the efficiency of hardware accelerators based on timing speculation (overclocking) and fault tolerance. We augment the accelerator with a lightweight error detection mechanism to protect against timing errors, enabling aggressive timing speculation. We demonstrate the validity of our approach for the convolution layers in convolutional neural networks. We present an implementation of a fault-tolerant convolution layer accelerator combined with the lightweight error detection. The error detection mechanism we have developed works at the algorithm-level, utilizing algebraic properties of the computation, allowing the full implementation to be realized using High-Level Synthesis tools. Our prototype on ZC706 demonstrated 68%-77% higher throughput with negligible overhead.
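The abstract does not spell out which algebraic property the detection mechanism relies on. As a hedged illustration only, the sketch below assumes a checksum built on the linearity of convolution: the element-wise sum of all output feature maps must equal the convolution of the input with the sum of all filters. The function names (`conv2d`, `conv_layer`, `checksum_filter`, `check`) and the integer test data are hypothetical and are not taken from the paper.

```python
def conv2d(x, w):
    """Valid 2D convolution (correlation form) of an integer image x with a KxK kernel w."""
    H, W, K = len(x), len(x[0]), len(w)
    return [[sum(x[i + a][j + b] * w[a][b] for a in range(K) for b in range(K))
             for j in range(W - K + 1)]
            for i in range(H - K + 1)]


def conv_layer(x, filters):
    """Compute one output feature map per filter (single input channel for brevity)."""
    return [conv2d(x, w) for w in filters]


def checksum_filter(filters):
    """Element-wise sum of all filters; convolving with it predicts the sum of the outputs."""
    K = len(filters[0])
    return [[sum(w[a][b] for w in filters) for b in range(K)] for a in range(K)]


def check(x, filters, outputs):
    """Return True when the summed outputs match the checksum convolution exactly."""
    expected = conv2d(x, checksum_filter(filters))
    return all(sum(o[i][j] for o in outputs) == expected[i][j]
               for i in range(len(expected))
               for j in range(len(expected[0])))


# Assumed toy data: a 6x6 integer input and four 3x3 integer filters.
x = [[(3 * i + 5 * j) % 7 - 3 for j in range(6)] for i in range(6)]
filters = [[[(f + 3 * a + b) % 5 - 2 for b in range(3)] for a in range(3)]
           for f in range(4)]

outputs = conv_layer(x, filters)
print(check(x, filters, outputs))   # True: fault-free outputs satisfy the checksum

outputs[2][1][1] += 1               # inject a single corrupted output value
print(check(x, filters, outputs))   # False: the mismatch is detected
```

Integer arithmetic keeps the comparison exact. In this sketch the check costs one extra convolution for every group of filters, so its relative cost shrinks as the number of filters grows; errors that happen to cancel in the sum would go undetected.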
Pages: 177-184
Page count: 8
Related papers
14 in total
  • [1] Safe Overclocking for CNN Accelerators Through Algorithm-Level Error Detection
    Marty, Thibaut
    Yuki, Tomofumi
    Derrien, Steven
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (12) : 4777 - 4790
  • [2] Algorithm Level Error Detection in Low Voltage Systolic Array
    Safarpour, Mehdi
    Inanlou, Reza
    Silven, Olli
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (02) : 569 - 573
  • [3] Soft Error Detection through Low-level Re-execution
    De Blaere, Brent
    Vankeirsbilck, Jens
    Boydens, Jeroen
    2021 5TH INTERNATIONAL CONFERENCE ON SYSTEM RELIABILITY AND SAFETY (ICSRS 2021), 2021, : 181 - 189
  • [4] Algorithm level re-computing using implementation diversity: A register transfer level concurrent error detection technique
    Karri, R
    Wu, KJ
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2002, 10 (06) : 864 - 875
  • [5] Utilizing Parity Checking to Optimize Soft Error Detection Through Low-Level Reexecution
    De Blaere, Brent
    Vankeirsbilck, Jens
    Boydens, Jeroen
    IEEE TRANSACTIONS ON RELIABILITY, 2023, 72 (04) : 1355 - 1366
  • [6] Algorithm level recomputing using allocation diversity: A register transfer level approach to time redundancy-based concurrent error detection
    Wu, KJ
    Karri, R
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2002, 21 (09) : 1077 - 1087
  • [7] An Algorithm Based Concurrent Error Detection Scheme for AES
    Zhang, Chang N.
    Yu, Qian
    Liu, Xiao Wei
    CRYPTOLOGY AND NETWORK SECURITY, 2010, 6467 : 31 - 42
  • [8] Multi-level checkpointing and silent error detection for linear workflows
    Benoit, Anne
    Cavelan, Aurelien
    Robert, Yves
    Sun, Hongyang
    JOURNAL OF COMPUTATIONAL SCIENCE, 2018, 28 : 398 - 415
  • [9] Online Error Detection Through Trace Infrastructure in ARM Microprocessors
    Pena-Fernandez, M.
    Lindoso, A.
    Entrena, L.
    Garcia-Valderas, M.
    Morilla, Y.
    Martin-Holgado, P.
    IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2019, 66 (07) : 1457 - 1464
  • [10] Cost-Effective Error Detection Through Mersenne Modulo Shadow Datapaths
    Campbell, Keith
    Lin, Chen-Hsuan
    Chen, Deming
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 38 (06) : 1056 - 1069