Dynamic Optimizations in GPU using Roofline Model

Cited by: 5

Authors:

Thomas, Winnie [1]

Toraskar, Suryakant [1]

Singh, Virendra [1]

Affiliation:

[1] Indian Inst Technol, Comp Architecture & Dependable Syst Lab, Dept Elect Engn, Mumbai, Maharashtra, India

Source:

2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021

Keywords:

GPU; Roofline; optimizations; scheduling;

DOI:

10.1109/ISCAS51556.2021.9401255

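Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract: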

Massively parallel processors such as graphics processing units (GPUs) often face the challenge of resource underutilization due to the varying resource proclivity of workloads. Running multiple applications on a GPU is a well-known and efficient way to mitigate underutilization. This paper proposes a multi-application oriented framework that carries out dynamic optimizations based on the operational intensities of the co-running applications. The framework uses the Roofline model to analyze each application's operational intensity and identify its bottleneck resource. We demonstrate that the proposed optimizations improve the utilization and system-wide throughput of a GPU co-running applications with irregular resource demands. The dynamic optimizations improve performance by 14.8% on average and by up to 72.4% over a state-of-the-art spatial multitasking technique.
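
The abstract does not describe how the framework computes or uses operational intensity, so the following is only a minimal sketch of the standard Roofline classification it refers to, not the paper's method. Operational intensity is taken as floating-point operations per byte of DRAM traffic, and a kernel is treated as memory-bound when its intensity falls below the ridge point (peak compute divided by peak bandwidth). The `Kernel` class, the function names, and all hardware and kernel numbers are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    """Hypothetical per-kernel counters, for illustration only."""
    name: str
    flops: float       # floating-point operations executed
    dram_bytes: float  # bytes moved to and from DRAM


def operational_intensity(kernel: Kernel) -> float:
    """Operational intensity in FLOP per byte of DRAM traffic."""
    return kernel.flops / kernel.dram_bytes


def attainable_gflops(oi: float, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Roofline bound: the lower of the compute roof and the bandwidth roof."""
    return min(peak_gflops, oi * peak_bw_gbs)


def bottleneck(oi: float, peak_gflops: float, peak_bw_gbs: float) -> str:
    """Classify a kernel by comparing its intensity with the ridge point."""
    ridge_point = peak_gflops / peak_bw_gbs  # FLOP per byte
    return "memory" if oi < ridge_point else "compute"


# Assumed machine: 1000 GFLOP/s peak compute, 200 GB/s peak DRAM bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BW_GBS = 200.0

# Assumed kernels: one with low intensity, one with high intensity.
streaming = Kernel("streaming", flops=2e9, dram_bytes=1e9)
dense = Kernel("dense", flops=4e10, dram_bytes=1e9)

for k in (streaming, dense):
    oi = operational_intensity(k)
    print(k.name, oi,
          bottleneck(oi, PEAK_GFLOPS, PEAK_BW_GBS),
          attainable_gflops(oi, PEAK_GFLOPS, PEAK_BW_GBS))
```

Under these assumed numbers the ridge point is 5 FLOP/byte, so a scheduler built on this classification could pair a memory-bound kernel with a compute-bound one so that they contend for different resources. How the paper's framework actually acts on the classification is not stated in this record.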


Pages: 5
