Dynamic Optimizations in GPU using Roofline Model

Cited by: 3
Authors
Thomas, Winnie [1]
Toraskar, Suryakant [1]
Singh, Virendra [1]
Affiliations
[1] Indian Inst Technol, Comp Architecture & Dependable Syst Lab, Dept Elect Engn, Mumbai, Maharashtra, India
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021
Keywords
GPU; Roofline; optimizations; scheduling
DOI
10.1109/ISCAS51556.2021.9401255
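Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract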
Massively parallel processors such as graphics processing units (GPUs) often suffer from resource underutilization because workloads differ in which resources they stress. Running multiple applications concurrently on a GPU is a known and efficient way to mitigate this underutilization. This paper proposes a multi-application-oriented framework that carries out dynamic optimizations based on the operational intensities of the co-running applications. The framework uses the Roofline model to analyze each application's operational intensity and identify its bottleneck resource. We demonstrate that the proposed optimizations improve the utilization and system-wide throughput of a GPU co-running applications with irregular resource demands. The dynamic optimizations improve performance by 14.8% on average, and by up to 72.4%, over a state-of-the-art spatial multitasking technique.
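The Roofline classification mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's framework: the names Kernel, operational_intensity, attainable_gflops, and bottleneck, the peak throughput and bandwidth figures, and the two example kernels are all hypothetical values chosen for illustration. The sketch only shows the standard Roofline rule, under which a kernel whose operational intensity (FLOPs per byte of memory traffic) falls below the ridge point is bounded by memory bandwidth, and otherwise by compute throughput.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    flops: float        # floating-point operations executed (hypothetical count)
    bytes_moved: float  # bytes transferred to/from DRAM (hypothetical count)


def operational_intensity(k: Kernel) -> float:
    """FLOPs performed per byte of DRAM traffic."""
    return k.flops / k.bytes_moved


def attainable_gflops(oi: float, peak_gflops: float, peak_gbps: float) -> float:
    """Roofline bound: the lower of the compute roof and the bandwidth roof."""
    return min(peak_gflops, oi * peak_gbps)


def bottleneck(oi: float, peak_gflops: float, peak_gbps: float) -> str:
    """Classify a kernel by comparing its intensity with the ridge point."""
    ridge = peak_gflops / peak_gbps  # intensity at which the two roofs meet
    return "memory" if oi < ridge else "compute"


# Assumed machine peaks, not taken from the paper.
PEAK_GFLOPS = 1000.0
PEAK_GBPS = 200.0

kernels = [
    Kernel("stream_like", flops=1e9, bytes_moved=4e9),  # low intensity
    Kernel("gemm_like", flops=4e10, bytes_moved=1e9),   # high intensity
]

for k in kernels:
    oi = operational_intensity(k)
    print(k.name, oi, bottleneck(oi, PEAK_GFLOPS, PEAK_GBPS),
          attainable_gflops(oi, PEAK_GFLOPS, PEAK_GBPS))
```

Under these assumed peaks the ridge point is 5 FLOPs/byte, so the first kernel is classified as memory-bound and the second as compute-bound. Pairing kernels with complementary bottlenecks is one plausible reason such a classification is useful for co-scheduling, although the abstract does not specify which optimizations the framework applies.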
Pages: 5