AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning

Cited by: 32
Authors
Parvini, Mohammad [1]
Javan, Mohammad Reza [2]
Mokari, Nader [1]
Abbasi, Bijan [1]
Jorswieck, Eduard A. [3]
Affiliations
[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran 1411713116, Iran
[2] Shahrood Univ Technol, Fac Elect Engn, Shahrood 3619995161, Iran
[3] TU Braunschweig, Inst Commun Technol, D-2338106 Braunschweig, Germany
Keywords
Resource management; Cams; Long Term Evolution; Wireless communication; Vehicle dynamics; Task analysis; Interference; V2X; AoI; Platoon cooperation; MARL; MANAGEMENT; COMMUNICATION; VEHICLES;
DOI
10.1109/TVT.2023.3259688
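Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract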
This paper investigates the problem of age of information (AoI) aware radio resource management for a platooning system. Multiple autonomous platoons exploit the cellular wireless vehicle-to-everything (C-V2X) communication technology to disseminate the cooperative awareness messages (CAMs) to their followers while ensuring timely delivery of safety-critical messages to the Road-Side Unit (RSU). To lower the computational load at the RSU and cope with the challenges of dynamic channel conditions, we exploit a distributed resource allocation framework based on multi-agent reinforcement learning (MARL), where each platoon leader (PL) acts as an agent and interacts with the environment to learn its optimal policy. Motivated by the existing literature in RL, we propose two novel MARL frameworks based on the multi-agent deep deterministic policy gradient (MADDPG), named Modified MADDPG, and Modified MADDPG with task decomposition. Both algorithms train two critics with the following goals: A global critic which estimates the global expected reward and motivates the agents toward a cooperating behavior and an exclusive local critic for each agent that estimates the local individual reward. Furthermore, based on the tasks each agent has to accomplish, in the second algorithm, the holistic individual reward of each agent is decomposed into multiple sub-reward functions where task-wise value functions are learned separately. Numerical results indicate our proposed algorithms' effectiveness compared with other contemporary RL frameworks, e.g., federated reinforcement learning (FRL) in terms of AoI performance and CAM message transmission probability.
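The task decomposition described in the abstract relies on a simple property: when an agent's individual reward is a sum of sub-rewards, the discounted return of the total equals the sum of the task-wise discounted returns, so task-wise value functions can be learned separately and added back together. The sketch below only illustrates that property; it is not the authors' implementation. The task names (`cam_delivery`, `aoi_penalty`), the reward values, and the unweighted sum used in `combined_score` to mix a global-critic estimate with the task-wise estimates are all assumptions made for illustration.

```python
from typing import Dict, List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Discounted sum r_0 + gamma*r_1 + gamma^2*r_2 + ... of one reward trace."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def decomposed_returns(sub_rewards: Dict[str, List[float]],
                       gamma: float) -> Dict[str, float]:
    """One discounted return per task (the target a task-wise critic would fit)."""
    return {task: discounted_return(trace, gamma)
            for task, trace in sub_rewards.items()}


def combined_score(global_value: float, task_values: Dict[str, float]) -> float:
    """Hypothetical mixing rule: global estimate plus the sum of task-wise estimates."""
    return global_value + sum(task_values.values())


if __name__ == "__main__":
    # Made-up sub-reward traces for a single agent over three time steps.
    sub_rewards = {
        "cam_delivery": [1.0, 0.0, 1.0],    # assumed reward for delivering CAMs
        "aoi_penalty": [-1.0, -2.0, -1.0],  # assumed penalty that grows with AoI
    }
    gamma = 0.5

    per_task = decomposed_returns(sub_rewards, gamma)
    total_trace = [sum(step) for step in zip(*sub_rewards.values())]

    print("task-wise returns:", per_task)
    print("sum of task-wise returns:", sum(per_task.values()))
    print("return of summed reward:", discounted_return(total_trace, gamma))
    print("combined score with global value 2.0:", combined_score(2.0, per_task))
```

In this toy trace the sum of the task-wise returns and the return of the summed reward coincide, which is the linearity that makes the decomposition valid. In a learned setting each task-wise critic would approximate its own return with function approximation, and the exact way the global and local estimates are combined is specified in the paper, not here.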
Pages: 9880-9896
Number of pages: 17
Related papers
43 in total
  • [41] Policy network-based dual-agent deep reinforcement learning for multi-resource task offloading in multi-access edge cloud networks
    Feng, Chuan
    Xu, Zhang
    Han, Pengchao
    Ma, Tianchun
    Gong, Xiaoxue
    CHINA COMMUNICATIONS, 2024, 21 (04) : 53 - 73
  • [42] Multi-Agent Deep Reinforcement Learning Based Resource Allocation for Ultra-Reliable Low-Latency Internet of Controllable Things
    Xiao, Yang
    Song, Yuqian
    Liu, Jun
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2023, 22 (08) : 5414 - 5430
  • [43] Multi-agent deep reinforcement learning based resource management in SWIPT enabled cellular networks with H2H/M2M co-existence
    Li, Xuehua
    Wei, Xing
    Chen, Shuo
    Sun, Lixin
    AD HOC NETWORKS, 2023, 149