Improving Yield and Reliability of Chip Multiprocessors

Cited by: 0
Authors
Pan, Abhisek [1]
Khan, Omer [1]
Kundu, Sandip [1]
Affiliation
[1] Univ Massachusetts, Amherst, MA 01003 USA
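Source
DATE: 2009 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, VOLS 1-3 | 2009
Keywords
yield; reliability; microarchitecture; multiprocessors
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract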
An increasing number of hardware failures can be attributed to device reliability problems that cause partial system failure or shutdown. In this paper we propose a scheme for improving reliability of a homogeneous chip multiprocessor (CMP) that also serves to improve manufacturing yield. Our solution centers on exploiting the natural redundancy that already exists in multi-core systems by using services from other cores for functional units that are defective in a faulty core. A micro-architectural modification allows a core on a CMP to use another core as a coprocessor to service any instruction that the former cannot execute correctly. This service is accessed to improve yield and reliability, but at the cost of some loss of performance. In order to quantify this loss we have used a cycle-accurate simulator to simulate the performance of a dual-core system with one or two cores sustaining partial failure. Our results indicate that when a large and sparingly-used unit such as a floating point arithmetic unit fails in a core, even for a floating point intensive benchmark, we can continue to run each faulty core with help from companion cores with as little as 10% impact to performance and less than 1% area overhead.
Pages: 490-495
Page count: 6