Improving Yield and Reliability of Chip Multiprocessors

Authors
Pan, Abhisek [1]
Khan, Omer [1]
Kundu, Sandip [1]
Affiliation
[1] Univ Massachusetts, Amherst, MA 01003 USA
Source
DATE: 2009 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, VOLS 1-3 | 2009
Keywords
yield; reliability; microarchitecture; multiprocessors
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An increasing number of hardware failures can be attributed to device reliability problems that cause partial system failure or shutdown. In this paper we propose a scheme for improving the reliability of a homogeneous chip multiprocessor (CMP) that also improves manufacturing yield. Our solution exploits the natural redundancy that already exists in multi-core systems: a faulty core uses services from other cores in place of its own defective functional units. A micro-architectural modification allows a core on a CMP to use another core as a coprocessor to service any instruction that the former cannot execute correctly. This service improves yield and reliability at the cost of some performance. To quantify this loss, we used a cycle-accurate simulator to simulate a dual-core system with one or two cores sustaining partial failure. Our results indicate that when a large and sparingly used unit such as a floating-point arithmetic unit fails in a core, each faulty core can continue to run with help from companion cores, even on a floating-point-intensive benchmark, with as little as 10% performance impact and less than 1% area overhead.
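The coprocessor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' micro-architecture or their cycle-accurate simulator: the `Core` class, the unit names (`alu`, `lsu`, `fpu`), the `dispatch` function, and the first-order slowdown model in `estimated_slowdown` are all hypothetical and chosen only for illustration. The sketch assumes each forwarded instruction pays a fixed extra latency and ignores contention on the companion core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    """Hypothetical core model: a name plus the functional units that still work."""
    name: str
    working_units: frozenset


def dispatch(unit, local, companion):
    """Return the name of the core that executes an instruction needing `unit`.

    The local core is preferred; the companion is used only when the
    local unit is defective.
    """
    if unit in local.working_units:
        return local.name
    if unit in companion.working_units:
        return companion.name
    raise RuntimeError(f"no core can execute instructions needing {unit!r}")


def estimated_slowdown(remote_fraction, base_cpi, remote_penalty_cycles):
    """First-order fractional slowdown (assumed model, not from the paper).

    Each forwarded instruction is assumed to add `remote_penalty_cycles`
    to the baseline cycles-per-instruction `base_cpi`.
    """
    return remote_fraction * remote_penalty_cycles / base_cpi


# Illustrative dual-core setup: core0 has lost its floating-point unit.
core0 = Core("core0", frozenset({"alu", "lsu"}))
core1 = Core("core1", frozenset({"alu", "lsu", "fpu"}))
```

Under these assumptions, `dispatch("fpu", core0, core1)` selects the companion core, while integer instructions stay local. The slowdown estimate grows with the fraction of forwarded instructions, which is consistent with the abstract's focus on a large but sparingly used unit.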
Pages: 490-495
Page count: 6