Smart Reconfiguration Approach for Fault-Tolerant NoC Based MPSoCs

Cited by: 2
Authors
Silveira, Jarbas [1]
Cortez, Paulo [1]
Cadore, Alan [1]
Mota, Rafael [1]
Marcon, Cesar [2]
Brahm, Lucas [2]
Fernandes, Ramon [2]
Affiliations
[1] Fed Univ Ceara UFC, DETI, LESC, Fortaleza, Ceara, Brazil
[2] Pontificia Univ Catolica Rio Grande do Sul, Porto Alegre, RS, Brazil
Source
2015 28th Symposium on Integrated Circuits and Systems Design (SBCCI) | 2015
Keywords
Fault-tolerance; NoC; MPSoC; routing methods; reconfiguration; routing algorithms; network; variability
DOI
10.1145/2800986.2801027
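Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract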
Recent integrated-circuit fabrication technologies allow billions of transistors on a single chip, making it possible to implement complex parallel systems that require a highly scalable, parallel communication architecture such as a Network-on-Chip (NoC). These technologies operate very close to physical limits, which increases faults both in manufacturing and at runtime. It is therefore essential to provide a fault-recovery mechanism that keeps the NoC operating in the presence of faults. Preprocessing the most probable fault scenarios, combined with flit retransmission capability, makes it possible to compute deadlock-free routings in advance, reducing the time the system must be interrupted when a fault occurs and keeping links in operation through retransmission. This work proposes a smart decision mechanism for errors on NoC links, composed of a hardware part implemented in the links and routers and a software part implemented in the operating system kernel of each processor. The mechanism defines thresholds that determine when it is better to reconfigure the NoC and when it is better to retransmit flits with errors. Experimental results, with several NoC sizes and some error models, indicate when it is better to reconfigure the NoC and when it is better to keep some links operating with occasional faults.
Pages: 6
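Illustrative sketch (not the authors' implementation): the abstract describes thresholds that separate "retransmit" from "reconfigure" but gives no formula. The Python below shows one way such a threshold could be reasoned about, under a deliberately simple cost model that is entirely an assumption: each flit on a faulty link fails independently with probability `p_err`, a failed flit is resent until it arrives intact, and reconfiguration costs a one-time interruption plus a fixed detour penalty per flit. All function names, parameters, and cycle counts are hypothetical.

```python
def expected_retransmit_overhead(p_err, flits, retry_cycles):
    """Expected extra cycles if every corrupted flit is resent until intact.

    Assumes independent flit errors, so retries per flit follow a
    geometric distribution with mean p_err / (1 - p_err).
    """
    if not 0.0 <= p_err < 1.0:
        raise ValueError("p_err must be in [0, 1)")
    return flits * retry_cycles * p_err / (1.0 - p_err)


def reconfigure_overhead(flits, reconf_cycles, detour_cycles):
    """One-time interruption cost plus a per-flit penalty for the longer route."""
    return reconf_cycles + flits * detour_cycles


def should_reconfigure(p_err, flits, retry_cycles, reconf_cycles, detour_cycles):
    """True when rerouting is expected to cost less than continued retransmission."""
    retransmit = expected_retransmit_overhead(p_err, flits, retry_cycles)
    reroute = reconfigure_overhead(flits, reconf_cycles, detour_cycles)
    return retransmit > reroute


def error_rate_threshold(flits, retry_cycles, reconf_cycles, detour_cycles):
    """Error probability at which both options have the same expected cost.

    Solves p / (1 - p) = c for p, where c is the reconfiguration overhead
    divided by the cost of one retry for every flit.
    """
    c = reconfigure_overhead(flits, reconf_cycles, detour_cycles) / (flits * retry_cycles)
    return c / (1.0 + c)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the functions.
    params = dict(flits=1000, retry_cycles=4, reconf_cycles=2000, detour_cycles=2)
    print(error_rate_threshold(**params))
    print(should_reconfigure(0.1, **params))
    print(should_reconfigure(0.6, **params))
```

In this toy model the threshold rises with the reconfiguration and detour costs and falls as retries become more expensive, which matches the qualitative trade-off the abstract describes. The thresholds reported in the paper come from experiments over several NoC sizes and error models, not from this formula.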