Two-Level Scheduling Algorithms for Deep Neural Network Inference in Vehicular Networks

Cited by: 3
Authors
Wu, Yalan [1 ,2 ]
Wu, Jigang [3 ]
Yao, Mianyang [3 ]
Liu, Bosheng [3 ]
Chen, Long [3 ]
Lam, Siew Kei [2 ]
Affiliations
[1] Guangdong Univ Technol, Sch Integrated Circuits, Guangzhou 510006, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Vehicular network; two-level task scheduling; DNN inference; quality of computing services; accelerator; RESOURCE-ALLOCATION; EDGE; ACCELERATION;
DOI
10.1109/TITS.2023.3266795
Chinese Library Classification (CLC)
TU [Building Science];
Discipline Code
0813;
Abstract
In vehicular networks, task scheduling at the microarchitecture level and the network level offers tremendous potential to improve the quality of computing services for deep neural network (DNN) inference. However, existing task scheduling works focus on only one of the two levels, which results in inefficient utilization of computing resources. This paper aims to fill this gap by formulating a two-level scheduling problem for DNN inference tasks in a vehicular network, with the objective of minimizing the total weighted sum of response time and energy consumption for all tasks under per-task response time, per-vehicle energy consumption, and per-vehicle storage capacity constraints. We first formulate the problem and prove that it is NP-hard. A group-transformation-based algorithm, called GTA, is proposed. GTA makes scheduling decisions at the network level using the group-transformation-based approach, and at the microarchitecture level using a greedy strategy. In addition, an algorithm, denoted as DRL, is proposed to further decrease the total weighted sum of response time and energy consumption for all tasks. DRL trains two models with deep reinforcement learning to achieve two-level scheduling. The proposed algorithms are evaluated on a platform consisting of a desktop, a Raspberry Pi, Eyeriss, OSM, SUMO, and NS-3. Simulation results show that DRL outperforms the state-of-the-art methods in all cases, while the proposed GTA outperforms the state-of-the-art methods in most cases, in terms of the total weighted sum of response time and energy consumption. Compared with four baseline algorithms, GTA and DRL reduce the total weighted sum of response time and energy consumption by 41.49% and 62.38% on average, respectively, for different numbers of tasks.
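To make the objective and constraints described in the abstract concrete, the Python sketch below models the total weighted sum of response time and energy consumption and a toy greedy placement of tasks onto vehicles. Everything in it is an assumption for illustration: the Task/Vehicle fields, the per-placement profile of (response time, energy) values, and the 0.5/0.5 weights are hypothetical, and the greedy loop only mimics the flavor of a network-level decision; it is not the paper's GTA group transformation or its DRL agents.

```python
# Illustrative sketch only: the cost model, weights, and greedy loop below are
# assumptions made to visualize the objective described in the abstract, not
# the paper's GTA or DRL algorithms.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Task:
    task_id: int
    deadline: float      # per-task response-time bound (s)
    model_size: float    # storage the DNN model needs on a vehicle (MB)

@dataclass
class Vehicle:
    vehicle_id: int
    energy_budget: float  # per-vehicle energy constraint (J)
    storage: float        # per-vehicle storage capacity (MB)
    used_energy: float = 0.0
    used_storage: float = 0.0

# profile[(task_id, vehicle_id)] = (response_time, energy) for that placement.
# In the paper such values would come from the microarchitecture-level
# schedule on the accelerator; here they are assumed to be given.
Profile = Dict[Tuple[int, int], Tuple[float, float]]

def weighted_cost(rt: float, energy: float,
                  w_t: float = 0.5, w_e: float = 0.5) -> float:
    """Weighted sum of response time and energy for one task."""
    return w_t * rt + w_e * energy

def greedy_network_schedule(tasks: List[Task], vehicles: List[Vehicle],
                            profile: Profile) -> Dict[int, int]:
    """Toy network-level heuristic: give each task to the feasible vehicle
    with the smallest weighted cost, respecting the three constraints named
    in the abstract (response time, energy budget, storage capacity)."""
    placement: Dict[int, int] = {}
    for task in tasks:
        best, best_cost = None, float("inf")
        for v in vehicles:
            rt, energy = profile[(task.task_id, v.vehicle_id)]
            if (rt <= task.deadline
                    and v.used_energy + energy <= v.energy_budget
                    and v.used_storage + task.model_size <= v.storage):
                cost = weighted_cost(rt, energy)
                if cost < best_cost:
                    best, best_cost = v, cost
        if best is not None:
            best.used_energy += profile[(task.task_id, best.vehicle_id)][1]
            best.used_storage += task.model_size
            placement[task.task_id] = best.vehicle_id
    return placement
```

Summing weighted_cost over the resulting placements gives the quantity the paper's algorithms minimize; per the abstract, GTA instead makes the network-level decision with a group-transformation approach and the microarchitecture-level decision greedily, while DRL learns both decisions with deep reinforcement learning.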
Pages: 9324-9343
Page count: 20