Error-Resilient Design Techniques for Reliable and Dependable Computing

Cited by: 14
Authors
Das, Shidhartha [1]
Bull, David M. [1]
Whatmough, Paul N. [1]
Affiliations
[1] ARM Ltd, Cambridge CB1 9NJ, England
Keywords
Error-resilient computing; variation mitigation; energy-efficient digital design; POWER; TOLERANCE; SYSTEM
DOI
10.1109/TDMR.2015.2389038
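Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract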
Integrated circuits in modern systems-on-chip and microprocessors are typically operated with sufficient timing margins to mitigate the impact of rising process, voltage, and temperature (PVT) variations at advanced process nodes. The widening margins required for ensuring robust computation inevitably lead to conservative designs with unacceptable energy-efficiency overheads. Reconciling the conflicting objectives imposed by variation mitigation and energy-efficient computing will require fundamental departures from conventional circuit and system design practices. This paper posits error-resilient general-purpose computing as an effective approach for achieving this. We review resilient techniques that exploit tolerance to timing errors to automatically compensate for variations and dynamically tune a system to its most efficient operating point. We present the Razor approach as a pioneering example of such a technique. We present silicon measurement results from multiple industrial and academic demonstration systems that employ Razor dynamic voltage and frequency management. In particular, we highlight the application of Razor to two specific platforms. The first is an ARM-based industrial prototype where Razor dynamic adaptation leads to 52% energy savings at 1 GHz operation. The second platform applies Razor for robust operation in the presence of radiation-induced Single Event Upsets. These efforts clearly demonstrate how energy-efficient compute engines can be designed by combining timing-error resiliency with optimizations across algorithms, circuits, and microarchitecture boundaries.
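The adaptation loop summarized in the abstract (tolerate occasional timing errors, then use the observed error rate to tune the operating point) can be illustrated with a minimal sketch. This is not the Razor implementation from the paper: the linear error model, the 900 mV critical voltage, the step size, the target error rate, and the names `error_rate` and `tune_voltage` are all hypothetical and chosen only for illustration.

```python
def error_rate(voltage_mv, critical_mv=900.0, slope_per_mv=0.002):
    """Toy error model (an assumption, not measured data).

    No timing errors at or above critical_mv; below it, the error rate
    grows linearly with the voltage deficit, capped at 1.0.
    """
    if voltage_mv >= critical_mv:
        return 0.0
    return min(1.0, (critical_mv - voltage_mv) * slope_per_mv)


def tune_voltage(start_mv=1100.0, target_rate=0.015, step_mv=5.0,
                 floor_mv=600.0, max_iters=200, measure=error_rate):
    """Hypothetical error-rate-driven voltage controller.

    Each iteration samples the error rate at the current voltage.
    If the rate exceeds the target, the voltage is raised by one step;
    otherwise it is lowered by one step (never below floor_mv).
    Returns the final voltage and the full voltage history.
    """
    v = start_mv
    history = [v]
    for _ in range(max_iters):
        if measure(v) > target_rate:
            v += step_mv          # too many errors: back off
        elif v - step_mv >= floor_mv:
            v -= step_mv          # errors tolerable: reclaim margin
        history.append(v)
    return v, history


if __name__ == "__main__":
    final_mv, trace = tune_voltage()
    print("final voltage (mV):", final_mv)
    print("lowest voltage visited (mV):", min(trace))
```

Under this toy model the controller removes the static margin above the critical voltage and then hovers just below the first point of failure, which is the behavior the abstract describes as tuning a system to its most efficient operating point. A real design would also need an error-recovery mechanism so that detected timing errors do not corrupt architectural state.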
Pages: 24-34
Page count: 11
Related papers
21 in total
[1]   A 45 nm Resilient Microprocessor Core for Dynamic Variation Tolerance [J].
Bowman, Keith A. ;
Tschanz, James W. ;
Lu, Shih-Lien L. ;
Aseron, Paolo A. ;
Khellah, Muhammad M. ;
Raychowdhury, Arijit ;
Geuskens, Bibiche M. ;
Tokunaga, Carlos ;
Wilkerson, Chris B. ;
Karnik, Tanay ;
De, Vivek K. .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2011, 46 (01) :194-208
[2]   A Power-Efficient 32 bit ARM Processor Using Timing-Error Detection and Correction for Transient-Error Tolerance and Adaptation to PVT Variation [J].
Bull, David ;
Das, Shidhartha ;
Shivashankar, Karthik ;
Dasika, Ganesh S. ;
Flautner, Krisztian ;
Blaauw, David .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2011, 46 (01) :18-31
[3]   A self-tuning DVS processor using delay-error detection and correction [J].
Das, S ;
Roberts, D ;
Lee, S ;
Pant, S ;
Blaauw, D ;
Austin, T ;
Flautner, K ;
Mudge, T .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2006, 41 (04) :792-804
[4]  
Das S., 2013, P IEEE CUST INT CIRC, P2290
[5]   Adaptive Design for Nanometer Technology [J].
Das, Shidhartha ;
Blaauw, David .
ISCAS: 2009 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-5, 2009, :77-+
[6]   RazorII: In Situ Error Detection and Correction for PVT and SER Tolerance [J].
Das, Shidhartha ;
Tokunaga, Carlos ;
Pant, Sanjay ;
Ma, Wei-Hsiang ;
Kalaiselvan, Sudherssen ;
Lai, Kevin ;
Bull, David M. ;
Blaauw, David T. .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2009, 44 (01) :32-48
[7]  
Drake A., 2007, 2007 IEEE International Solid-State Circuits Conference (IEEE Cat. No.07CH37858), P398, DOI 10.1109/ISSCC.2007.373462
[8]   Razor: Circuit-level correction of timing errors for low-power operation [J].
Ernst, D ;
Das, S ;
Lee, S ;
Blaauw, D ;
Austin, T ;
Mudge, T ;
Kim, NS ;
Flautner, K .
IEEE MICRO, 2004, 24 (06) :10-20
[9]   DARK SILICON AND THE END OF MULTICORE SCALING [J].
Esmaeilzadeh, Hadi ;
Blem, Emily ;
St Amant, Renee ;
Sankaralingam, Karthikeyan ;
Burger, Doug .
IEEE MICRO, 2012, 32 (03) :122-134
[10]   A 90-nm variable frequency clock system for a power-managed Itanium Architecture processor [J].
Fischer, T ;
Desai, J ;
Doyle, B ;
Naffziger, S ;
Patella, B .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2006, 41 (01) :218-228