CARSS: Client-Aware Resource Sharing and Scheduling for Heterogeneous Applications

Cited by: 3

Authors:

Baek, Iljoo [1]

Harding, Matthew [1]

Kanda, Akshit [1]

Choi, Kyung Ryeol [2]

Samii, Soheil [3,4]

Rajkumar, Ragunathan [1]

Affiliations:

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[2] George Washington Univ, Washington, DC 20052 USA

[3] Gen Motors R&D, Warren, MI USA

[4] Linkoping Univ, Linkoping, Sweden

Source:

2020 IEEE REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM (RTAS 2020) | 2020

Keywords:

resource sharing; gpu; heterogeneous; hardware accelerators; soft real-time;

DOI:

10.1109/RTAS48715.2020.00008

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Modern hardware accelerators such as GP-GPUs and DSPs are commonly being used in real-time settings such as high-performance multimedia systems and autonomous vehicles. In fact, the throughput of a wide variety of computationally demanding tasks from 3D graphics and rendering to image processing and deep learning can benefit from such specialized hardware. Such heterogeneity can affect the performance of applications running simultaneously on the same accelerator. Prior studies on resource sharing and scheduling on hardware accelerators have not attempted to account for this context. In this work, we provide a portable tagging-based cooperative scheduler and resource monitor for use by heterogeneous applications sharing a single hardware accelerator in a soft real-time environment. We also offer practical insight into how various types of applications use the hardware accelerators differently. We substantiate the feasibility of our approach and evaluate the improvement of various scheduling policies over a proprietary scheduler in several case studies with real-world applications on two NVIDIA platforms: a GeForce GTX 1070 GPU and an Xavier embedded platform. Although we focus on GPUs in this paper, our underlying observations and framework can also be used for sharing execution on other types of hardware accelerators.

References

Pages: 324-335

Page count: 12

20 entries in total

[1] Abadi M.M., 2015, TENSORFLOW LARGE SCA, DOI 10.5555/3026877.3026899
[2] The FMLP+: An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis
Brandenburg, Bjoern B.
[J]. 2014 26TH EUROMICRO CONFERENCE ON REAL-TIME SYSTEMS (ECRTS 2014), 2014, : 61 - 71
[3] Deadline-based Scheduling for GPU with Preemption Support
Capodieci, Nicola
Cavicchioli, Roberto
Bertogna, Marko
Paramakuru, Aingara
[J]. 2018 39TH IEEE REAL-TIME SYSTEMS SYMPOSIUM (RTSS 2018), 2018, : 119 - 130
[4] Efficient implementation of Genetic Algorithms on GP-GPU with scheduled persistent CUDA threads
Capodieci, Nicola
Burgio, Paolo
[J]. 2015 SEVENTH INTERNATIONAL SYMPOSIUM ON PARALLEL ARCHITECTURES, ALGORITHMS AND PROGRAMMING (PAAP), 2015, : 6 - 12
[5] Histograms of oriented gradients for human detection
Dalal, N
Triggs, B
[J]. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005, : 886 - 893
[6] Duato Jose, 2010, 2010 International Conference on High Performance Computing & Simulation (HPCS 2010), P224, DOI 10.1109/HPCS.2010.5547126
[7] Elliott Glenn A., 2013, P RTSS
[8] Farhadi, 2018, COMPUTER VISION PATT
[9] Howard AG, 2017, ARXIVABS170404861 CO
[10] Fractional GPUs: Software-based Compute and Memory Bandwidth Reservation for GPUs
Jain, Saksham
Baek, Iljoo
Wang, Shige
Rajkumar, Ragunathan
[J]. 25TH IEEE REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM (RTAS 2019), 2019, : 29 - 41
