EvoLP: Self-Evolving Latency Predictor for Model Compression in Real-Time Edge Systems

Cited by: 0
Authors
Huai, Shuo [1, 2]
Kong, Hao [1, 2]
Li, Shiqing [2]
Luo, Xiangzhong [2]
Subramaniam, Ravi [3]
Makaya, Christian [4]
Lin, Qian [4]
Liu, Weichen [2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Nanyang Technol Univ, HP NTU Digital Mfg Corp Lab, Singapore, Singapore
[3] Innovat & Experiences Business Personal Syst HP In, Palo Alto, CA 94304 USA
[4] HP Inc, HP Appl AI, Palo Alto, CA 94304 USA
Keywords
Predictive models; Training; Hardware; Runtime; Analytical models; Table lookup; Data models; Deep learning; neural networks; prediction methods; real-time systems
DOI
10.1109/LES.2023.3321599
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Edge devices are increasingly used to deploy deep learning applications on embedded systems. The real-time nature of many applications and the limited resources of edge devices necessitate latency-targeted neural network compression. However, measuring latency on real devices is challenging and expensive. Therefore, this letter presents a novel and efficient framework, named EvoLP, to accurately predict the inference latency of models on edge devices. The predictor can evolve during the network compression process to achieve higher latency prediction precision. In experiments on three edge devices and four model variants, EvoLP outperforms previous state-of-the-art approaches. Moreover, when incorporated into a model compression framework, it effectively guides the compression process toward higher model accuracy while satisfying strict latency constraints. We open-source EvoLP at https://github.com/ntuliuteam/EvoLP.
Pages: 174-177
Number of pages: 4