AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning

Cited by: 32

Authors:

Parvini, Mohammad [1]

Javan, Mohammad Reza [2]

Mokari, Nader [1]

Abbasi, Bijan [1]

Jorswieck, Eduard A. [3]

Affiliations:

[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran 1411713116, Iran

[2] Shahrood Univ Technol, Fac Elect Engn, Shahrood 3619995161, Iran

[3] TU Braunschweig, Inst Commun Technol, D-2338106 Braunschweig, Germany

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2023 / Vol. 72 / No. 08

Keywords:

Resource management; Cams; Long Term Evolution; Wireless communication; Vehicle dynamics; Task analysis; Interference; V2X; AoI; Platoon cooperation; MARL; MANAGEMENT; COMMUNICATION; VEHICLES;

DOI:

10.1109/TVT.2023.3259688

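Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract: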

This paper investigates the problem of age of information (AoI) aware radio resource management for a platooning system. Multiple autonomous platoons exploit the cellular wireless vehicle-to-everything (C-V2X) communication technology to disseminate the cooperative awareness messages (CAMs) to their followers while ensuring timely delivery of safety-critical messages to the Road-Side Unit (RSU). To lower the computational load at the RSU and cope with the challenges of dynamic channel conditions, we exploit a distributed resource allocation framework based on multi-agent reinforcement learning (MARL), where each platoon leader (PL) acts as an agent and interacts with the environment to learn its optimal policy. Motivated by the existing literature in RL, we propose two novel MARL frameworks based on the multi-agent deep deterministic policy gradient (MADDPG), named Modified MADDPG, and Modified MADDPG with task decomposition. Both algorithms train two critics with the following goals: A global critic which estimates the global expected reward and motivates the agents toward a cooperating behavior and an exclusive local critic for each agent that estimates the local individual reward. Furthermore, based on the tasks each agent has to accomplish, in the second algorithm, the holistic individual reward of each agent is decomposed into multiple sub-reward functions where task-wise value functions are learned separately. Numerical results indicate our proposed algorithms' effectiveness compared with other contemporary RL frameworks, e.g., federated reinforcement learning (FRL) in terms of AoI performance and CAM message transmission probability.
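The task decomposition described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's reward definitions, so everything below is an assumption made for illustration: the two tasks (AoI toward the RSU and CAM delivery to followers), the weights, the discount factor, and all function names are hypothetical. The sketch only shows the general idea that when a holistic reward is a weighted sum of sub-rewards, task-wise value targets can be computed separately and still recombine into the holistic target.

```python
from typing import Dict

# Hypothetical task names; the paper's actual tasks and rewards are not given here.
TASKS = ("aoi", "cam")


def holistic_reward(sub_rewards: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-task sub-rewards (assumed linear combination)."""
    return sum(weights[t] * sub_rewards[t] for t in TASKS)


def taskwise_td_targets(
    sub_rewards: Dict[str, float],
    next_values: Dict[str, float],
    gamma: float,
) -> Dict[str, float]:
    """One-step TD target for each task-wise value function, computed separately."""
    return {t: sub_rewards[t] + gamma * next_values[t] for t in TASKS}


def combine(values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Recombine task-wise quantities with the same weights as the reward."""
    return sum(weights[t] * values[t] for t in TASKS)


# Illustrative numbers only (not taken from the paper).
weights = {"aoi": 0.5, "cam": 2.0}
sub_rewards = {"aoi": -3.0, "cam": 1.0}    # e.g. negative age, CAM delivered
next_values = {"aoi": -10.0, "cam": 4.0}   # next-state estimates per task
gamma = 0.9

per_task = taskwise_td_targets(sub_rewards, next_values, gamma)
decomposed_target = combine(per_task, weights)
holistic_target = holistic_reward(sub_rewards, weights) + gamma * combine(next_values, weights)
```

Under these assumptions the recombined task-wise target matches the target built from the holistic reward, because the one-step TD target is linear in the reward. Learning the task-wise values separately gives each one a simpler signal to fit.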


Pages: 9880-9896

Page count: 17
