SMQoS: Improving Utilization and Energy Efficiency with QoS Awareness on GPUs

Cited by: 9
Authors
Sun, Qingxiao [1]
Liu, Yi [1]
Yang, Hailong [1]
Luan, Zhongzhi [1]
Qian, Depei [1]
Affiliations
[1] Beihang Univ, Sino German Joint Software Inst, Sch Comp Sci & Engn, Beijing, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2019
Funding
National Natural Science Foundation of China
Keywords
Graphics processing units; Quality of Service; Dynamic resource management; Throughput; Power efficiency; Multitasking
DOI
10.1109/cluster.2019.8891047
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Meeting Quality of Service (QoS) requirements under task consolidation on the GPU is extremely challenging. Previous work mostly relies on static task or resource scheduling and cannot handle QoS violations at runtime. In addition, existing work fails to exploit the computing characteristics of batch tasks, and thus misses opportunities to reduce power consumption while improving GPU utilization. To address these problems, we propose SMQoS, a new runtime mechanism that dynamically adjusts resource allocation to satisfy the QoS of latency-sensitive tasks and determines the optimal resource allocation for batch tasks to improve GPU utilization and power efficiency. Experimental results show that with SMQoS, 2.27% and 7.58% more task co-runs reach the 95% QoS target than with Spart and Rollover, respectively. In addition, SMQoS achieves 23.9% and 32.3% higher throughput and reduces power consumption by 25.7% and 10.1% compared to Spart and Rollover, respectively.
Pages: 362-366
Page count: 5