DeepRT: predictable deep learning inference for cyber-physical systems

Cited by: 8
Authors
Kang, Woochul [1]
Chung, Jaeyong [2]
Affiliations
[1] Incheon Natl Univ, Dept Embedded Syst Engn, Incheon, South Korea
[2] Incheon Natl Univ, Dept Elect Engn, Incheon, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Deep learning; QoS; Real-time; Embedded systems; Energy efficiency; Cyberphysical systems; DVFS;
DOI
10.1007/s11241-018-9314-y
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Recently, in mobile and embedded devices, deep learning is changing the way computers see, hear, and understand the world. When deep learning is deployed to such systems, they are supposed to perform inference tasks in a timely and energy-efficient manner. Much research has focused on taming deep learning for resource-constrained devices by either compressing deep learning models or devising hardware accelerators. However, these approaches have focused on providing "best-effort" performance for such devices. In this paper, we present the design and implementation of DeepRT, a novel deep learning inference runtime. Unlike previous approaches, DeepRT focuses on supporting predictable temporal and spatial inference performance when deep learning models are used in unpredictable and resource-constrained environments. In particular, DeepRT applies formal control theory to support Quality-of-Service (QoS) management that can dynamically minimize the tardiness of inference tasks at runtime while achieving high energy efficiency. Further, DeepRT determines a proper level of compression of deep learning models at runtime according to the memory availability and users' QoS requirements, resulting in proper trade-offs between memory savings and losses of inference accuracy. We evaluate DeepRT on a wide range of deep learning models under various conditions. The experimental results show that DeepRT supports the timeliness of inference tasks in a robust and energy-efficient manner.
Pages: 106-135
Page count: 30