Adaptive fault-tolerant architecture and routing algorithm for reliable many-core 3D-NoC systems

Cited by: 20
Authors
Ben Ahmed, Akram [1]
Ben Abdallah, Abderazek [2]
Affiliations
[1] Keio Univ, Dept Informat & Comp Sci, Yokohama, Kanagawa 2238522, Japan
[2] Univ Aizu, Grad Sch Comp Sci & Engn, Adapt Syst Lab, Aizu Wakamatsu, Fukushima 9658580, Japan
Keywords
3D NoC; Fault-tolerance; Robustness; Architecture; Dynamic reconfiguration; Deadlock-free; NETWORKS;
DOI
10.1016/j.jpdc.2016.03.014
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Over the last few decades, three-dimensional Networks-on-Chip (3D-NoCs) have shown their advantages over 2D-NoC architectures, thanks to the reduced average interconnect length and lower interconnect power consumption inherited from three-dimensional integrated circuits (3D-ICs). On the other hand, questions about their reliability are starting to arise. This issue is mainly caused by their complex nature, where a single faulty transistor may cause intolerable performance degradation or even the collapse of the entire system. To function correctly, 3D-NoC systems must tolerate any short-term malfunction or permanent physical damage, delivering messages on time while minimizing performance degradation. In this paper, we present a fault-tolerant 3D-NoC architecture called 3D-Fault-Tolerant-OASIS (3D-FTO). With the aid of a lightweight routing algorithm, 3D-FTO avoids system failure in the presence of a large number of transient, intermittent, and permanent faults. Moreover, the proposed architecture leverages reconfigurable components to handle faults in the links, input buffers, and crossbar, where faults occur most often. The proposed 3D-FTO system is able to work around different kinds of faults, ensuring graceful performance degradation while minimizing the additional hardware complexity and remaining power-efficient. © 2016 Elsevier Inc. All rights reserved.
Pages: 30-43
Page count: 14