Reinforcement Learning for Joint V2I Network Selection and Autonomous Driving Policies

Cited by: 7
Authors
Yan, Zijiang [1 ]
Tabassum, Hina [1 ]
Affiliations
[1] York Univ, Dept Elect Engn & Comp Sci, Toronto, ON, Canada
Source
2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022) | 2022
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Autonomous driving; reinforcement learning; multi-band network selection; resource allocation;
DOI
10.1109/GLOBECOM48099.2022.10001396
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Vehicle-to-Infrastructure (V2I) communication is becoming critical for the enhanced reliability of autonomous vehicles (AVs). However, uncertainties in road traffic and in the AVs' wireless connections can severely impair timely decision-making. It is thus critical to simultaneously optimize the AVs' network selection and driving policies in order to minimize road collisions while maximizing the communication data rates. In this paper, we develop a reinforcement learning (RL) framework to characterize efficient network selection and autonomous driving policies in a multi-band vehicular network (VNet) operating on conventional sub-6 GHz spectrum and Terahertz (THz) frequencies. The proposed framework is designed to (i) maximize traffic flow and minimize collisions by controlling the vehicle's motion dynamics (i.e., speed and acceleration) from the autonomous driving perspective, and (ii) maximize the data rates and minimize handoffs by jointly controlling the vehicle's motion dynamics and network selection from the telecommunication perspective. We cast this problem as a Markov Decision Process (MDP) and develop a deep Q-learning-based solution to optimize actions such as acceleration, deceleration, lane changes, and AV-base-station assignments for a given AV state, where the AV's state is defined by the velocities and communication channel states of the AVs. Numerical results demonstrate interesting insights into the inter-dependency of the vehicle's motion dynamics, handoffs, and the communication data rate. The proposed policies enable AVs to adopt safe driving behaviors with improved connectivity.
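To make the abstract's joint MDP concrete, the sketch below shows the general shape of Q-learning over a combined driving/network-selection action space. The paper itself uses a deep Q-network over continuous channel and velocity states; this toy substitutes tabular Q-learning on a tiny discretized state, and every name, reward weight, and transition rule here is an illustrative assumption, not the paper's actual model.

```python
import random

# Toy abstraction of the joint V2I MDP: the state is (speed_level, serving_band)
# and each action jointly picks a driving maneuver and a base-station band.
DRIVING_ACTIONS = ["accelerate", "decelerate", "keep", "change_lane"]
BANDS = ["sub6", "thz"]                                      # candidate bands
ACTIONS = [(d, b) for d in DRIVING_ACTIONS for b in BANDS]   # joint action space

def step(state, action):
    """Assumed toy transition and reward; state = (speed_level, serving_band)."""
    speed, serving = state
    drive, band = action
    if drive == "accelerate":
        speed = min(speed + 1, 3)
    elif drive == "decelerate":
        speed = max(speed - 1, 0)
    # ("keep" and "change_lane" leave the toy speed level unchanged.)
    # Reward shaping mirroring the stated objectives (weights are assumptions):
    rate = (3.0 if band == "thz" else 1.0) / (1 + speed)  # THz-like rate, hurt by mobility
    handoff = 1.0 if band != serving else 0.0             # penalize band switches (handoffs)
    collision = 2.0 if speed == 3 else 0.0                # proxy penalty for unsafe top speed
    return (speed, band), rate - handoff - collision

def q_learning(episodes=2000, horizon=20, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {}                                                # tabular stand-in for the DQN
    for _ in range(episodes):
        state = (0, "sub6")
        for _ in range(horizon):
            if rng.random() < eps:                        # epsilon-greedy exploration
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, r = step(state, action)
            best_next = max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
    return q

q = q_learning()
# Greedy joint action once the AV is slow and already served by the THz band.
greedy = max(ACTIONS, key=lambda a: q.get(((0, "thz"), a), 0.0))
```

Under these assumed rewards, the learned policy keeps speed low (avoiding the collision proxy) and sticks with the serving band (avoiding handoff penalties), illustrating the inter-dependency of motion dynamics, handoffs, and data rate that the abstract highlights; the paper replaces the table with a deep Q-network to handle continuous channel states.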
Pages: 1241-1246 (6 pages)