Lightweight asynchronous scheduling in heterogeneous reconfigurable systems

Cited by: 3

Authors:

Rodriguez, Andres [1]

Navarro, Angeles [1]

Nikov, Kris [2]

Nunez-Yanez, Jose [2]

Gran, Ruben [3]

Gracia, Dario Suarez [3]

Asenjo, Rafael [1]

Affiliations:

[1] Univ Malaga, Dept Comp Architecture, Malaga, Spain

[2] Univ Bristol, Dept Elect & Elect Engn, Bristol, Avon, England

[3] Univ Zaragoza, Comp Architecture Grp, Zaragoza, Spain

Source:

JOURNAL OF SYSTEMS ARCHITECTURE | 2022 / Volume 124

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Heterogeneous architecture; FPGA; Heterogeneous scheduling; Throughput model; Energy efficiency;

DOI:

10.1016/j.sysarc.2022.102398

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

The trend for heterogeneous embedded systems is the integration of accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware support for shared memory and low-overhead synchronization between the accelerator and the CPU cores makes the case for exploring strategies that exploit a tight collaboration between the CPUs and the accelerator. In this paper we propose a novel lightweight scheduling strategy, FastFit, targeted to FPGA accelerators, and a new scheduler based on it, named MultiFastFit, which asynchronously tackles heterogeneous systems comprising a variety of CPU cores and FPGA IPs. Our strategy significantly reduces the overhead of automatically computing the near-optimal chunksizes when compared to a previous state-of-the-art auto-tuned approach, which makes our approach more suitable for fine-grained applications. Additionally, our scheduler MultiFastFit has been designed to enable the efficient co-execution of work among compute devices in such a way that all the devices are kept busy while minimizing load imbalance. Our approaches have been evaluated using four benchmarks carefully tuned for the low-power UltraScale+ platform. Our experiments demonstrate that the FastFit strategy always finds the near-optimal FPGA chunksize for any device configuration at a reasonable cost, even for fine-grained and irregular applications, and that heterogeneous CPU+FPGA co-executions that exploit all the compute devices are usually faster and more energy efficient than the CPU-only and FPGA-only executions. We have also compared MultiFastFit with other state-of-the-art scheduling strategies, finding that it outperforms another auto-tuned approach by up to 2x and achieves similar results to manually tuned schedulers without requiring an offline search for the ideal CPU-FPGA partition or FPGA chunk granularity.


Pages: 14

