Dynamic Triple Modular Redundancy in Interleaved Hardware Threads: An Alternative Solution to Lockstep Multi-Cores for Fault-Tolerant Systems

Cited by: 2
Authors
Barbirotta, Marcello [1]
Menichelli, Francesco [1]
Cheikh, Abdallah [1]
Mastrandrea, Antonio [1]
Angioli, Marco [1]
Olivieri, Mauro [1]
Affiliations
[1] Sapienza Univ Rome, Dept Informat Engn Elect & Telecommun DIET, I-00184 Rome, Italy
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Termination of employment; Hardware; Circuit faults; Fault tolerant systems; Computer architecture; Registers; Digital integrated circuits; Field programmable gate arrays; Microprocessors; Radiation hardening (electronics); Redundancy; digital integrated circuits; fault detection; fault tolerant computing; field programmable gate arrays; microprocessors; multithreading; radiation hardening (electronics); redundancy; robustness; DESIGN;
DOI
10.1109/ACCESS.2024.3425579
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Over the years, significant work has been done on high-integrity systems, such as those found in cars, satellites, and aircraft, to minimize the risk that a logic fault causes a system failure; functional safety is therefore a key requirement for such systems. In this study, we combine the benefits of Dual Modular Redundancy (DMR) and Triple Modular Redundancy (TMR) within an Interleaved-Multi-Threading microprocessor core, by means of a microarchitecture that dynamically switches from DMR to TMR when a fault occurs. We present the quantitative results of an extensive fault-injection simulation campaign on the fault-tolerant core and compare its fault-tolerance capabilities with those of its previous version. The results show that, in several application cases, the fault resilience improvement and the hardware and timing overhead compare favorably with the lockstep-based dual-core approach. The proposed technique achieves 98.6% fault mitigation at the cost of only 4 clock cycles of roll-back overhead, with no checkpointing redundancy.
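
The switching idea in the abstract can be illustrated with a small software analogy. The sketch below is not the paper's microarchitecture: it assumes two redundant executions are compared (Dual Modular Redundancy), and only on a mismatch is a third execution run and a majority vote taken (Triple Modular Redundancy). The function names and the `fault_masks` parameter, which XORs a bit pattern into each copy's result to emulate an injected fault, are hypothetical and exist only for illustration.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def majority(values: Sequence[int]) -> int:
    """Return the value held by at least two of the three copies."""
    value, count = Counter(values).most_common(1)[0]
    if count < 2:
        # All three copies disagree: the vote cannot mask the fault.
        raise RuntimeError("no majority among redundant results")
    return value


def run_dmr_with_tmr_fallback(
    op: Callable[..., int],
    args: Tuple[int, ...],
    fault_masks: Sequence[int] = (0, 0, 0),  # hypothetical fault injection
) -> Tuple[int, str]:
    """Run `op` twice; on a mismatch, run a third copy and vote.

    Returns the accepted result and the mode that produced it.
    """
    # Two redundant executions (DMR); a nonzero mask flips result bits.
    r0 = op(*args) ^ fault_masks[0]
    r1 = op(*args) ^ fault_masks[1]
    if r0 == r1:
        return r0, "DMR"

    # Mismatch detected: add a third execution and vote (TMR).
    r2 = op(*args) ^ fault_masks[2]
    return majority([r0, r1, r2]), "TMR"


def add(a: int, b: int) -> int:
    return a + b


if __name__ == "__main__":
    print(run_dmr_with_tmr_fallback(add, (2, 3)))             # fault-free run
    print(run_dmr_with_tmr_fallback(add, (2, 3), (0, 4, 0)))  # one copy corrupted
```

In this analogy the third execution costs time only when the first two disagree, which is the general motivation for switching modes dynamically rather than always running three copies.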
Pages: 95720 - 95735
Number of pages: 16