Modeling and Analysis of Fault-tolerant Distributed Memories for Networks-on-Chip

Cited by: 0
Authors
BanaiyanMofrad, Abbas [1]
Dutt, Nikil [1]
Girao, Gustavo [2]
Affiliations
[1] Univ Calif Irvine, Ctr Embedded Comp Syst, Irvine, CA 92697 USA
[2] Univ Fed Rio Grande do Sul, Inst Informat, Porto Alegre, RS, Brazil
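Source
DESIGN, AUTOMATION & TEST IN EUROPE | 2013
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract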
Advances in technology scaling make Networks-on-Chip (NoCs) increasingly susceptible to failures that cause various reliability challenges. With the growing area occupied by different on-chip memories, maintaining the fault tolerance of distributed on-chip memories has become a major design challenge. We propose a system-level design methodology for scalable fault tolerance of distributed on-chip memories in NoCs. We introduce a novel reliability clustering model for fault-tolerance analysis and shared redundancy management of on-chip memory blocks. We perform extensive design space exploration, applying the proposed reliability clustering to a block-redundancy fault-tolerant scheme to evaluate the tradeoffs between reliability, performance, and overheads. Evaluations on a 64-core chip multiprocessor (CMP) with an 8x8 mesh NoC show that distinct strategies in our case study may yield up to 20% improvement in performance and 25% improvement in energy savings across different benchmarks, and uncover interesting design configurations.
Pages: 1605-1608
Page count: 4
Related papers
50 in total
[41]   DISTRIBUTED FAULT-TOLERANT ROUTING IN KAUTZ NETWORKS [J].
CHIANG, WK ;
CHEN, RJ .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1994, 20 (01) :99-106
[42]   Fault-tolerant Traffic-aware Routing Algorithm for 3-D Photonic Networks-on-chip [J].
Meyer, Michael Conrad ;
Wang, Yu ;
Watanabe, Takahiro .
2019 IEEE 13TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANY-CORE SYSTEMS-ON-CHIP (MCSOC 2019), 2019, :172-179
[43]   FAULT-TOLERANT ROUTING IN DISTRIBUTED LOOP NETWORKS [J].
MUKHOPADHYAYA, K ;
SINHA, BP .
IEEE TRANSACTIONS ON COMPUTERS, 1995, 44 (12) :1452-1456
[44]   Distributed Fault-Tolerant Quality of Wireless Networks [J].
Llewellyn, Larry C. ;
Hopkinson, Kenneth M. ;
Graham, Scott R. .
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2011, 10 (02) :175-190
[45]   FAULT-TOLERANT SEMICONDUCTOR MEMORIES [J].
SARRAZIN, DB ;
MALEK, M .
COMPUTER, 1984, 17 (08) :49-56
[46]   FAULT-TOLERANT ASSOCIATE MEMORIES [J].
FUJIWARA, E ;
TANAKA, T .
SYSTEMS AND COMPUTERS IN JAPAN, 1995, 26 (07) :1-12
[47]   DISTRIBUTED AND FAULT-TOLERANT COMPUTATION FOR RETRIEVAL TASKS USING DISTRIBUTED ASSOCIATIVE MEMORIES [J].
CHAR, JM ;
CHERKASSKY, V ;
WECHSLER, H ;
ZIMMERMAN, GL .
IEEE TRANSACTIONS ON COMPUTERS, 1988, 37 (04) :484-490
[48]   A Q-Learning-Based Fault-Tolerant and Congestion-Aware Adaptive Routing Algorithm for Networks-on-Chip [J].
Liu, Yi ;
Guo, Rujia ;
Xu, Changqing ;
Weng, Xiaodong ;
Yang, Yintang .
IEEE EMBEDDED SYSTEMS LETTERS, 2022, 14 (04) :203-206
[49]   Reconciling fault-tolerant distributed computing and systems-on-chip [J].
Fuegger, Matthias ;
Schmid, Ulrich .
DISTRIBUTED COMPUTING, 2012, 24 (06) :323-355
[50]   Reconciling fault-tolerant distributed computing and systems-on-chip [J].
Matthias Függer ;
Ulrich Schmid .
Distributed Computing, 2012, 24 :323-355