Fault Tolerance Through Redundant Execution on COTS Multicores: Exploring Trade-offs

Cited by: 5
Authors
Shen, Yanyan [1]
Heiser, Gernot
Elphinstone, Kevin
Affiliation
[1] UNSW Sydney, Sydney, NSW, Australia
Source
2019 49TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN 2019) | 2019
Keywords
seL4; microkernel; SEU; replication; fault tolerance; SINGLE-EVENT-UPSETS; INDUCED SOFT ERRORS
DOI
10.1109/DSN.2019.00031
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High availability and integrity are paramount in systems deployed in life- and mission-critical scenarios. Such fault-tolerance can be achieved through redundant co-execution (RCoE) on replicated hardware, now cheaply available with multicore processors. RCoE replicates almost all software, including OS kernel, drivers, and applications, achieving a sphere of replication that covers everything except the minimal interfaces to non-replicated peripherals. We complement our original, loosely-coupled RCoE with a closely-coupled version that improves transparency of replication to application code, and investigate the functionality, performance and vulnerability trade-offs.
Pages: 188-200
Page count: 13