Fault-Tolerant Mesh-Based NoC with Router-Level Redundancy

Cited by: 7
Authors
Chang, Yung-Chang [1]
Gong, Cihun-Siyong Alex [2, 3, 4]
Chiu, Ching-Te [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[2] Chang Gung Univ, Coll Engn, Dept Elect Engn, Taoyuan, Taiwan
[3] Chang Gung Univ, Coll Engn, Green Technol Res Ctr, Portable Energy Syst Grp, Taoyuan, Taiwan
[4] Chang Gung Mem Hosp, Dept Ophthalmol, Taoyuan, Taiwan
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2020, Vol. 92, No. 4
Keywords
Fault tolerance; Interconnections; Integrated circuit reliability; Network topology; ON-CHIP; NETWORK; DESIGN
DOI
10.1007/s11265-019-01476-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Aggressive CMOS technology scaling increasingly threatens the dependability of network-on-chip (NoC) architectures. In a mesh-based NoC, a faulty router or broken link may isolate a fully functional processing element (PE). In addition, a set of faulty routers may form isolated regions, which can degrade the design. In this paper, we propose a router-level redundancy (RLR) fault-tolerant scheme that differs from the traditional microarchitecture-level redundancy (MLR) approach, in order to relieve the problems of isolated PEs and isolated regions. By simply adding one spare router to each router set in a mesh, RLR is created and the connection paths between adjacent routers are diversified. To exploit this extra resource, two reconfiguration algorithms are presented that detour around detected faulty routers and links. The proposed RLR fault-tolerant scheme can tolerate at most one faulty router within a router set. After reconfiguration, the original mesh topology is maintained; as a result, the proposed architecture does not need any support from network-layer routing algorithms. The scheme is evaluated on three fault-tolerance metrics: reliability, mean time to failure (MTTF), and yield. The experimental results show that the performance of RLR increases as the size of the NoC grows, while the relative connection cost decreases at the same time. This characteristic makes the architecture suitable for large-scale NoC designs.
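The claim that one spare router per router set tolerates at most one faulty router can be illustrated with a simple binomial reliability estimate. The sketch below is not the paper's evaluation model: it assumes routers fail independently with the same survival probability `r`, treats the spare as identical to the other routers, ignores link faults and reconfiguration logic, and uses a hypothetical set size of four routers (the abstract does not state the set size). The function names are illustrative only.

```python
from math import comb


def set_reliability(r: float, n_routers: int, max_faults: int) -> float:
    """Probability that at most `max_faults` of `n_routers` have failed,
    assuming each router works independently with probability `r`."""
    return sum(
        comb(n_routers, k) * (1.0 - r) ** k * r ** (n_routers - k)
        for k in range(max_faults + 1)
    )


def mesh_reliability(r: float, n_sets: int, routers_per_set: int, spare: bool) -> float:
    """Reliability of a mesh made of `n_sets` independent router sets.

    With a spare, each set holds routers_per_set + 1 routers and survives
    up to one fault; without one, every router in the set must work.
    """
    if spare:
        per_set = set_reliability(r, routers_per_set + 1, max_faults=1)
    else:
        per_set = set_reliability(r, routers_per_set, max_faults=0)
    return per_set ** n_sets


if __name__ == "__main__":
    # Hypothetical values: r = 0.99 per router, four routers per set.
    for n_sets in (4, 16, 64):
        base = mesh_reliability(0.99, n_sets, 4, spare=False)
        rlr = mesh_reliability(0.99, n_sets, 4, spare=True)
        print(f"{n_sets:3d} sets: baseline={base:.4f}  with spare={rlr:.4f}")
```

Under these assumptions the gap between the two columns widens as the number of sets grows, which is consistent in direction with the abstract's statement that the benefit of RLR increases with NoC size; the actual figures in the paper come from its own models and experiments.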
Pages: 345-355
Page count: 11