GPU Architecture Aware Instruction Scheduling for Improving Soft-Error Reliability

Cited by: 5
Authors
Lee H. [1]
Al Faruque M.A. [1]
Affiliations
[1] Department of Electrical Engineering and Computer Science, University of California, Irvine, CA 92697
Keywords
compiler; GPGPU; instruction scheduling; reliability; soft-error
DOI
10.1109/TMSCS.2017.2667661
Abstract
The demand for low-power, high-performance computing has driven the semiconductor industry for decades, and semiconductor technology has been scaled down to meet it. At the same time, the technology has faced severe reliability challenges such as soft errors. Prior research has improved the soft-error reliability of the GPU using various methodologies, such as redundancy. However, the GPU compiler has yet to be considered as a means of improving soft-error reliability. In this paper, we propose a novel GPU-architecture-aware compilation methodology to improve the soft-error reliability of the GPU. The proposed methodology jointly considers the parallel behavior of the GPU hardware and the applications, and minimizes the vulnerability of GPU applications during instruction scheduling. In addition, it can complement any hardware-based soft-error reliability improvement technique. We compared our compilation methodology with state-of-the-art soft-error-reliability-aware techniques and with performance-aware instruction scheduling. We injected soft errors during the experiments and compared the number of correct executions, that is, executions with no erroneous output. Our methodology requires less performance and power overhead than the state-of-the-art soft-error reliability methodologies in most cases. The compilation time overhead of our methodology is 8.13 seconds on average. The experimental results show that our methodology improves soft-error reliability by 23 percent and 12 percent (up to 64 percent and 52 percent) compared to the state-of-the-art soft-error reliability and performance-aware compilation techniques, respectively. Moreover, we show that the soft-error reliability of a GPU is related not to performance but to the fine-grained timing behavior of an application. © 2015 IEEE.
Pages: 86-99
Page count: 13