Task mapping and scheduling for network-on-chip based multi-core platform with transient faults

Cited by: 27
Authors
Chatterjee, Navonil [1]
Paul, Suraj [1]
Chattopadhyay, Santanu [1]
Affiliations
[1] Indian Inst Technol, Dept Elect & Elect Commun Engn, Kharagpur 721302, WB, India
Keywords
Network-on-Chip; Dynamic mapping and scheduling; Energy; Deadline; Fault tolerance; DESIGN; MANAGEMENT; POWER;
DOI
10.1016/j.sysarc.2018.01.002
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Technology scaling has enabled the integration of a large number of transistors on a single chip, improving performance by bringing Processing Elements (PEs), Intellectual Property (IP) cores and memory units together on the same platform. On the downside, it has made on-chip components more susceptible to faults, both permanent and transient. Permanent faults are predictable in nature and can be dealt with at manufacturing time or in the field using spares/redundancy. Transient faults also degrade application performance but are unpredictable. Handling transient faults is challenging, especially in a real-time system where different applications execute under various timing constraints. Although a significant amount of work on transient fault management has been reported in the literature, it does not address satisfying the temporal constraints of tasks while restricting the energy expenditure of the system. Existing fault-tolerant policies replicate tasks to ensure a higher percentage of deadline satisfaction, but at the cost of higher energy consumption. Checkpointing can keep energy consumption low; however, the number of tasks satisfying their timing constraints also becomes low. Thus, a fault-tolerant policy that jointly addresses the timing and energy constraints of a real-time system is desirable. This work proposes an algorithm that intelligently performs fault-tolerant resource allocation in real-time dynamic scenarios where the tasks of applications are not known a priori. The slack times of incoming tasks are exploited in the application mapping/scheduling phase of the algorithm to assign a fault-tolerant policy to each task, mitigating the effect of transient faults. This helps to improve the deadline satisfaction of the tasks and also reduces energy consumption.
Compared with existing works, the proposed algorithm achieves 19.8%, 43.5% and 85.8% improvement in deadline satisfaction over MXR [1], CPR [2] and TR [3], respectively. On average, energy consumption is reduced by 29.1% and 6.7% compared to AR [4] and MXR [1], respectively.
Pages: 34-56
Number of pages: 23
Related papers
45 entries in total
  • [11] Eles P, 2008, DES AUT TEST EUROPE, P960
  • [12] Single Event Transients in Digital CMOS-A Review
    Ferlet-Cavrois, Veronique
    Massengill, Lloyd W.
    Gouker, Pascale
    [J]. IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2013, 60 (03) : 1767 - 1790
  • [13] A Novel Bicriteria Scheduling Heuristics Providing a Guaranteed Global System Failure Rate
    Girault, Alain
    Kalla, Hamoudi
    [J]. IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2009, 6 (04) : 241 - 254
  • [14] Energy-aware communication and task scheduling for network-on-chip architectures under real-time constraints
    Hu, JC
    Marculescu, R
    [J]. DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION, VOLS 1 AND 2, PROCEEDINGS, 2004, : 234 - 239
  • [15] A framework for reliability-aware design exploration on MPSoC based systems
    Huang, Jia
    Raabe, Andreas
    Huang, Kai
    Buckl, Christian
    Knoll, Alois
    [J]. DESIGN AUTOMATION FOR EMBEDDED SYSTEMS, 2012, 16 (04) : 189 - 220
  • [16] Design optimization of time- and cost-constrained fault-tolerant distributed embedded systems
    Izosimov, V
    Pop, P
    Eles, P
    Peng, Z
    [J]. DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 864 - 869
  • [17] Jia Huang, 2011, 2011 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), P247
  • [18] Kahng AB, 2009, DES AUT TEST EUROPE, P423
  • [19] Optimal Checkpoint Selection with Dual-Modular Redundancy Hardening
    Kang, Shin-Haeng
    Park, Hae-woo
    Kim, Sungchan
    Oh, Hyunok
    Ha, Soonhoi
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2015, 64 (07) : 2036 - 2048
  • [20] Static Mapping of Mixed-Critical Applications for Fault-Tolerant MPSoCs
    Kang, Shin-haeng
    Yang, Hoeseok
    Kim, Sungchan
    Bacivarov, Iuliana
    Ha, Soonhoi
    Thiele, Lothar
    [J]. 2014 51ST ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2014,