DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads

Cited by: 1
Authors
Kim, Seah [1]
Kwon, Hyoukjun [2, 3]
Song, Jinook [3]
Jo, Jihyuck [3]
Chen, Yu-Hsin [3]
Lai, Liangzhen [3]
Chandra, Vikas [3]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA USA
[2] UC Irvine, Irvine, CA 92697 USA
[3] Meta, Sunnyvale, CA 94089 USA
Source
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2023, VOL 4 | 2023
Keywords
Scheduler; AR/VR; Multi-model ML; Hardware-Software Co-Design; ALGORITHM; PRECEDENCE; DEADLINES; TASKS
DOI
10.1145/3623278.3624753
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Emerging real-time multi-model ML (RTMM) workloads such as AR/VR and drone control exhibit dynamic behaviors at several granularities: tasks, models, and layers within a model. These dynamic behaviors pose new challenges for the system software of an ML system because, unlike in traditional ML workloads, the overall system load is not fully predictable. In addition, RTMM workloads require real-time processing, involve highly heterogeneous models, and target resource-constrained devices. Under these conditions, an effective scheduler becomes more important for making good use of the underlying hardware while accounting for the unique characteristics of RTMM workloads. We therefore propose a new scheduler, DREAM, which handles the various forms of dynamicity in RTMM workloads on multi-accelerator systems. DREAM quantifies the unique requirements of RTMM workloads and uses the resulting scores to drive scheduling decisions, taking into account the current system load and other inference jobs on different models and input frames. DREAM exposes tunable parameters that allow fast and effective adaptation to dynamic workload changes. In our evaluation of five RTMM workload scenarios, DREAM reduces the overall UXCost, a metric defined in the paper as an RTMM equivalent of the energy-delay product (EDP), by 32.2% and 50.0% in the geometric mean (up to 80.8% and 97.6%) compared to state-of-the-art baselines, which shows the efficacy of our scheduling methodology.
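For readers unfamiliar with the two quantities the abstract's result is stated in, the following is a minimal sketch of the standard energy-delay product and of one common way to aggregate per-scenario reductions with a geometric mean. It does not reproduce UXCost, whose definition is given only in the paper. The function names and all numbers are hypothetical and chosen for illustration; the aggregation shown (geometric mean of proposed-to-baseline cost ratios) is an assumption about how such a figure could be computed, not a statement of the authors' procedure.

```python
import math


def energy_delay_product(energy_joules, delay_seconds):
    """Standard EDP: energy multiplied by delay (lower is better)."""
    return energy_joules * delay_seconds


def geomean_reduction(baseline_costs, proposed_costs):
    """Fractional cost reduction, aggregated as a geometric mean of ratios.

    Assumption: each scenario contributes the ratio proposed / baseline,
    and the reported reduction is 1 minus the geometric mean of those ratios.
    """
    ratios = [p / b for b, p in zip(baseline_costs, proposed_costs)]
    geomean_ratio = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return 1.0 - geomean_ratio


# Hypothetical per-scenario costs (not taken from the paper).
baseline = [4.0, 2.0, 1.0]
proposed = [2.0, 1.0, 0.5]
reduction = geomean_reduction(baseline, proposed)
print(f"geometric-mean reduction: {reduction:.1%}")
```

The geometric mean is the usual choice when averaging ratios across scenarios, since a single scenario with a large absolute cost does not dominate the result.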
Pages: 73-86
Page count: 14