Binary Search-Based Fast Scheduling Algorithms for Reliability-Aware Energy-Efficient Task Graph Scheduling With Fault Tolerance

被引:1
作者
Biswas, Sajib K. [1 ]
Muhuri, Pranab K. [1 ]
Roy, Uttam K. [1 ]
机构
[1] South Asian Univ, Dept Comp Sci, Delhi 110068, India
来源
IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING | 2024年 / 9卷 / 03期
关键词
Reliability; Task analysis; Schedules; Energy consumption; Fault tolerant systems; Fault tolerance; Energy efficiency; Directed acyclic graph; dynamic voltage and frequency scaling; energy-efficient scheduling; fault tolerance; greedy binary search; heterogeneous computing systems; reliability; search-space; RELIABLE PARALLEL APPLICATIONS; PRECEDENCE CONSTRAINED TASKS; SYSTEMS; OPTIMIZATION; CONSUMPTION;
D O I
10.1109/TSUSC.2023.3295939
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Among the available processor-level energy savings schemes, dynamic voltage and frequency scaling (DVFS) is very popular and effective due to its widespread cross-platform use in designing energy-efficient scheduling algorithms. However, rapid frequency switching by DVFS based algorithms while minimizing the energy consumptions may result transient failures in the system. To avoid such failures and their catastrophic consequences, energy-efficient scheduling algorithms with the capabilities to provide more reliable task schedules are always in demand. Therefore, this paper introduces two novel low complexity energy-efficient task scheduling algorithms for heterogeneous computing environments. We term the first algorithm as 'binary search-based energy-efficient scheduling with reliability goal (BSESRG)' for running parallel task graphs in heterogeneous computing systems. We show that the proposed BSESRG has the capability to reduce energy consumption, and shorten the total schedule length by meeting the reliability goals up to a certain threshold. Then, we present our second algorithm, the 'binary search-based energy-efficient fault-tolerant scheduling with reliability goal (BSESRG-FT), which ensures meeting the reliability goals with simultaneous consideration of fault tolerance. The proposed BSESRG-FT is able to reach higher reliability goals, reduce energy consumption, and shorten the total schedule length of a parallel task graph on heterogeneous platforms. We demonstrate the working of both BSESRG and BSESRG-FT through simulation experiments considering real-world task graphs, and show the supremacy of the two proposed algorithms over their respective peers (viz., ESRG and EFSRG) in terms of energy savings, schedule lengths, run times and reliability goals. The superiority of the proposed BSESRG and BSESRG-FT over their respective competitors are also validated on the real benchmark MiBench. Moreover, from the complexity analysis, we respectively find the time complexities of BSESRG and BSESRG-FT as O(|X| x |P| x log(2)|F|) and O (|X| x |P|(2 )x log(2)|F|) confirming their better computational efficiency than the respective peers.
引用
收藏
页码:433 / 451
页数:19
相关论文
共 49 条
  • [21] Qingjia Huang, 2012, Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), P781, DOI 10.1109/CCGrid.2012.49
  • [22] Task Scheduling for Energy Consumption Constrained Parallel Applications on Heterogeneous Computing Systems
    Quan, Zhe
    Wang, Zhi-Jie
    Ye, Ting
    Guo, Song
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (05) : 1165 - 1182
  • [23] READY: Reliability- and Deadline-Aware Power-Budgeting for Heterogeneous Multicore Systems
    Saber-Latibari, Javad
    Ansari, Mohsen
    Gohari-Nazari, Pourya
    Yari-Karin, Sina
    Monazzah, Amir Mahdi Hosseini
    Ejlali, Alireza
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2021, 40 (04) : 646 - 654
  • [24] A Survey of Fault-Tolerance Techniques for Embedded Systems From the Perspective of Power, Energy, and Thermal Issues
    Safari, Sepideh
    Ansari, Mohsen
    Khdr, Heba
    Gohari-Nazari, Pourya
    Yari-Karin, Sina
    Yeganeh-Khaksar, Amir
    Hessabi, Shaahin
    Ejlali, Alireza
    Henkel, Jorg
    [J]. IEEE ACCESS, 2022, 10 : 12229 - 12251
  • [25] TherMa-MiCs: Thermal-Aware Scheduling for Fault-Tolerant Mixed-Criticality Systems
    Safari, Sepideh
    Khdr, Heba
    Gohari-Nazari, Pourya
    Ansari, Mohsen
    Hessabi, Shaahin
    Henkel, Joerg
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (07) : 1678 - 1694
  • [26] On the Scheduling of Energy-Aware Fault-Tolerant Mixed-Criticality Multicore Systems with Service Guarantee Exploration
    Safari, Sepideh
    Ansari, Mohsen
    Ershadi, Ghazal
    Hessabi, Shaahin
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (10) : 2338 - 2354
  • [27] MODELS AND ALGORITHMS FOR RELIABILITY-ORIENTED TASK-ALLOCATION IN REDUNDANT DISTRIBUTED-COMPUTER SYSTEMS
    SHATZ, SM
    WANG, JP
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 1989, 38 (01) : 16 - 27
  • [28] An Energy-Efficient Task Scheduling Algorithm in DVFS-enabled Cloud Environment
    Tang, Zhuo
    Qi, Ling
    Cheng, Zhenzhen
    Li, Kenli
    Khan, Samee U.
    Li, Keqin
    [J]. JOURNAL OF GRID COMPUTING, 2016, 14 (01) : 55 - 74
  • [29] A Graph-Based Fault-Tolerant Approach to Modeling QoS for IoT-Based Surveillance Applications
    Thomas, Diya
    Orgun, Mehmet
    Hitchens, Michael
    Shankaran, Rajan
    Mukhopadhyay, Subhas Chandra
    Ni, Wei
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (05) : 3587 - 3604
  • [30] Thulasiraman M. N. S., 1992, Theory and Algorithms