Two-Level Scheduling Algorithms for Deep Neural Network Inference in Vehicular Networks

Cited by: 3
Authors
Wu, Yalan [1 ,2 ]
Wu, Jigang [3 ]
Yao, Mianyang [3 ]
Liu, Bosheng [3 ]
Chen, Long [3 ]
Lam, Siew Kei [2 ]
Affiliations
[1] Guangdong Univ Technol, Sch Integrated Circuits, Guangzhou 510006, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Vehicular network; two-level task scheduling; DNN inference; quality of computing services; accelerator; RESOURCE-ALLOCATION; EDGE; ACCELERATION;
DOI
10.1109/TITS.2023.3266795
Chinese Library Classification (CLC)
TU [Building Science];
Discipline Code
0813;
Abstract
In vehicular networks, task scheduling at the microarchitecture level and the network level offers tremendous potential to improve the quality of computing services for deep neural network (DNN) inference. However, existing task scheduling works focus on only one of the two levels, which results in inefficient utilization of computing resources. This paper aims to fill this gap by formulating a two-level scheduling problem for DNN inference tasks in a vehicular network, with the objective of minimizing the total weighted sum of response time and energy consumption for all tasks under per-task response time, per-vehicle energy consumption, and per-vehicle storage capacity constraints. We first formulate the problem and prove that it is NP-hard. A group-transformation-based algorithm, called GTA, is proposed. GTA makes scheduling decisions at the network level using the group-transformation-based approach, and at the microarchitecture level using a greedy strategy. In addition, an algorithm, denoted as DRL, is proposed to further decrease the total weighted sum of response time and energy consumption for all tasks. DRL trains two models with deep reinforcement learning to achieve two-level scheduling. The proposed algorithms are evaluated on a platform consisting of a desktop, a Raspberry Pi, Eyeriss, OSM, SUMO, and NS-3. Simulation results show that DRL outperforms the state-of-the-art methods in all cases, while the proposed GTA outperforms the state-of-the-art methods in most cases, in terms of the total weighted sum of response time and energy consumption. Compared with four baseline algorithms, GTA and DRL reduce the total weighted sum of response time and energy consumption by 41.49% and 62.38% on average, respectively, for different numbers of tasks.
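To make the objective and constraints described in the abstract concrete, the Python sketch below models the total weighted sum of response time and energy consumption and a toy greedy placement of tasks onto vehicles. Everything in it is an assumption for illustration: the Task/Vehicle fields, the per-placement profile of (response time, energy) values, and the 0.5/0.5 weights are hypothetical, and the greedy loop only mimics the flavor of a network-level decision; it is not the paper's GTA group transformation or its DRL agents.

```python
# Illustrative sketch only: the cost model, weights, and greedy loop below are
# assumptions made to visualize the objective described in the abstract, not
# the paper's GTA or DRL algorithms.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Task:
    task_id: int
    deadline: float      # per-task response-time bound (s)
    model_size: float    # storage the DNN model needs on a vehicle (MB)

@dataclass
class Vehicle:
    vehicle_id: int
    energy_budget: float  # per-vehicle energy constraint (J)
    storage: float        # per-vehicle storage capacity (MB)
    used_energy: float = 0.0
    used_storage: float = 0.0

# profile[(task_id, vehicle_id)] = (response_time, energy) for that placement.
# In the paper such values would come from the microarchitecture-level
# schedule on the accelerator; here they are assumed to be given.
Profile = Dict[Tuple[int, int], Tuple[float, float]]

def weighted_cost(rt: float, energy: float,
                  w_t: float = 0.5, w_e: float = 0.5) -> float:
    """Weighted sum of response time and energy for one task."""
    return w_t * rt + w_e * energy

def greedy_network_schedule(tasks: List[Task], vehicles: List[Vehicle],
                            profile: Profile) -> Dict[int, int]:
    """Toy network-level heuristic: give each task to the feasible vehicle
    with the smallest weighted cost, respecting the three constraints named
    in the abstract (response time, energy budget, storage capacity)."""
    placement: Dict[int, int] = {}
    for task in tasks:
        best, best_cost = None, float("inf")
        for v in vehicles:
            rt, energy = profile[(task.task_id, v.vehicle_id)]
            if (rt <= task.deadline
                    and v.used_energy + energy <= v.energy_budget
                    and v.used_storage + task.model_size <= v.storage):
                cost = weighted_cost(rt, energy)
                if cost < best_cost:
                    best, best_cost = v, cost
        if best is not None:
            best.used_energy += profile[(task.task_id, best.vehicle_id)][1]
            best.used_storage += task.model_size
            placement[task.task_id] = best.vehicle_id
    return placement
```

Summing weighted_cost over the resulting placements gives the quantity the paper's algorithms minimize; per the abstract, GTA instead makes the network-level decision with a group-transformation approach and the microarchitecture-level decision greedily, while DRL learns both decisions with deep reinforcement learning.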
Pages: 9324-9343
Page count: 20