Multi-Agent Reinforcement Learning Resources Allocation Method Using Dueling Double Deep Q-Network in Vehicular Networks

Cited by: 22

Authors:

Ji, Yuxin [1]

Wang, Yu [1]

Zhao, Haitao

Gui, Guan [1]

Gacanin, Haris [2]

Sari, Hikmet [1]

Adachi, Fumiyuki [3]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Telecommun & Informat Engn, Nanjing 210003, Peoples R China

[2] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst, D-52062 Aachen, Germany

[3] Tohoku Univ, Int Res Inst Disaster Sci IRIDeS, Sendai, Miyagi 9808577, Japan

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2023 / Vol. 72 / No. 10

Keywords:

Internet of vehicles; transmit power; spectrum allocation; multi-agent reinforcement learning; LATENCY; RELIABILITY; CHALLENGES;

DOI:

10.1109/TVT.2023.3275546

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Vehicle-to-vehicle (V2V) communications are high-frequency and periodic and involve group sending and group receiving, which leads to serious collisions over wireless resources and limits system capacity. Moreover, rapid channel changes in high-mobility vehicular environments preclude collecting accurate instantaneous channel state information at the base station for centralized resource management. For the Internet of Vehicles (IoV), it is a fundamental challenge to achieve low-latency, high-reliability communication for real-time data interaction over short distances in a complex wireless propagation environment, and to attenuate and avoid inter-vehicle interference in the region through reasonable spectrum allocation. To address these problems, this paper proposes a resource allocation (RA) method using dueling double deep Q-network reinforcement learning (RL) with low-dimensional fingerprints and a soft-update architecture (D3QN-LS). A multi-agent model is constructed in a virtual urban environment with a Manhattan grid layout, in which V2V links act as agents that reuse vehicle-to-infrastructure (V2I) spectrum resources. In addition, we increase the amount of transmitted data and add scenarios where spectrum resources are relatively scarce, i.e., the number of V2V links is significantly larger than the amount of available spectrum, to address some shortcomings of existing studies. We demonstrate that the proposed D3QN-LS algorithm further improves the total capacity of V2I links and the success rate of periodic secure message transmission over V2V links.
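The three generic building blocks named in the abstract (dueling aggregation, the double DQN target, and a soft target-network update) can be sketched as follows. This is a minimal NumPy illustration of the standard textbook forms, not the authors' implementation: the function names, the discount factor `gamma`, and the mixing rate `tau` are assumptions introduced here, and the low-dimensional fingerprints and the vehicular environment are not modeled.

```python
import numpy as np

def dueling_q(value, advantage):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a).

    value has shape (batch, 1); advantage has shape (batch, n_actions).
    Subtracting the mean advantage keeps V and A identifiable.
    """
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the next action,
    and the target network evaluates it."""
    best_action = np.argmax(q_online_next, axis=-1)
    next_q = np.take_along_axis(
        q_target_next, best_action[..., None], axis=-1
    ).squeeze(-1)
    # No bootstrapping from terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * next_q

def soft_update(target_params, online_params, tau):
    """Soft (Polyak) update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_params, online_params)]
```

In a multi-agent setting such as the one the abstract describes, each V2V link would hold its own copy of these networks; how the agents share experience or parameters is not stated in this record and is left out of the sketch.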


Pages: 13447-13460

Number of pages: 14
