ReTMiC: Reliability-Aware Thermal Management in Multicore Mixed-Criticality Embedded Systems

Citations: 0
Authors
Safari, Sepideh [1]
Ansari, Mohsen [1, 2]
Hessabi, Shaahin [2]
Henkel, Joerg [3]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran 193955746, Iran
[2] Sharif Univ Technol, Dept Comp Sci & Engn, Tehran 1136511155, Iran
[3] Karlsruhe Inst Technol KIT, Dept Comp Sci, D-76131 Karlsruhe, Germany
Keywords
Reliability; Multicore processing; Quality of service; Fault tolerant systems; Fault tolerance; Real-time systems; Timing; Embedded systems; Termination of employment; Temperature; Mixed-criticality systems; multicore platforms; task replication; embedded systems; thermal balancing; REDUNDANCY; DEMAND
DOI
10.1109/ACCESS.2025.3542472
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As the number of cores in multicore platforms increases, temperature constraints may prevent all cores from being powered simultaneously at the maximum voltage and frequency level. Thermal hot spots and unbalanced temperatures across the processing cores may degrade reliability. This paper introduces ReTMiC, a reliability-aware thermal management scheduling method for mixed-criticality embedded systems. ReTMiC meets the Thermal Design Power (TDP) as the chip-level power constraint at design time. To balance the temperature of the processing cores, the method determines balancing points in each frame of the schedule; at run time, a lightweight online re-mapping technique is activated at each of these balancing points. The online mechanism uses the proposed temperature-aware factor to reduce the system's temperature based on the current temperature of the processing cores and the behavior of the tasks running on them. Experimental results show that ReTMiC achieves up to a 12.8 °C reduction in chip temperature and a 3.5 °C reduction in spatial thermal variation compared with state-of-the-art techniques, while keeping system reliability at the required level.
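The abstract does not give the re-mapping rule or the definition of the temperature-aware factor, so the following is only an illustrative sketch of the general idea of re-mapping at a balancing point. It assumes a simple greedy heuristic (the task with the highest estimated power moves to the coolest core) and a plain sum-of-power check against TDP. The function names, the power estimates, and the temperature readings are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def within_tdp(task_power: Dict[str, float], tdp_watts: float) -> bool:
    """Chip-level check: total power of the running tasks must not exceed TDP."""
    return sum(task_power.values()) <= tdp_watts


def remap_at_balancing_point(core_temps: Dict[int, float],
                             task_power: Dict[str, float]) -> Dict[str, int]:
    """Greedy stand-in for a re-mapping step (NOT the paper's algorithm).

    The task with the highest estimated power is placed on the coolest core,
    the next-highest on the next-coolest core, and so on.
    """
    if len(task_power) > len(core_temps):
        raise ValueError("more tasks than cores")
    # Coolest core first; ties broken by core id for determinism.
    cores_cool_first = sorted(core_temps, key=lambda c: (core_temps[c], c))
    # Most power-hungry task first; ties broken by task name.
    tasks_hot_first = sorted(task_power, key=lambda t: (-task_power[t], t))
    return dict(zip(tasks_hot_first, cores_cool_first))


# Made-up sensor readings (degrees C) and per-task power estimates (W).
core_temps = {0: 78.0, 1: 61.5, 2: 70.2, 3: 65.0}
task_power = {"t_hi": 3.2, "t_mid": 2.1, "t_lo": 0.9}

mapping = remap_at_balancing_point(core_temps, task_power)
print(mapping)
```

A real scheme for mixed-criticality systems would also have to respect deadlines, criticality levels, and the placement of task replicas, none of which this sketch models.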
Pages: 33157-33175
Page count: 19