Multi-Agent Reinforcement Learning Resources Allocation Method Using Dueling Double Deep Q-Network in Vehicular Networks

Cited by: 22
Authors
Ji, Yuxin [1]
Wang, Yu [1]
Zhao, Haitao
Gui, Guan [1]
Gacanin, Haris [2]
Sari, Hikmet [1]
Adachi, Fumiyuki [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Telecommun & Informat Engn, Nanjing 210003, Peoples R China
[2] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst, D-52062 Aachen, Germany
[3] Tohoku Univ, Int Res Inst Disaster Sci IRIDeS, Sendai, Miyagi 9808577, Japan
Keywords
Internet of vehicles; transmit power; spectrum allocation; multi-agent reinforcement learning; latency; reliability; challenges
DOI
10.1109/TVT.2023.3275546
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Vehicle-to-vehicle (V2V) communications are frequent and periodic and involve group sending and group receiving, which leads to serious collisions over wireless resources and limits system capacity. In addition, rapid channel variation in high-mobility vehicular environments prevents the base station from collecting accurate instantaneous channel state information for centralized resource management. For the Internet of Vehicles (IoV), it is a fundamental challenge to achieve low-latency, high-reliability communication for real-time, short-range data exchange in a complex wireless propagation environment, and to mitigate and avoid inter-vehicle interference in the region through reasonable spectrum allocation. To address these problems, this paper proposes a resource allocation (RA) method that uses dueling double deep Q-network reinforcement learning (RL) with low-dimensional fingerprints and a soft-update architecture (D3QN-LS). A multi-agent model is constructed in a virtual urban environment with a Manhattan grid layout, in which V2V links act as agents that reuse vehicle-to-infrastructure (V2I) spectrum resources. We also increase the amount of transmitted data and add scenarios in which spectrum resources are relatively scarce, i.e., the number of V2V links is significantly larger than the amount of available spectrum, to address some shortcomings of existing studies. We demonstrate that the proposed D3QN-LS algorithm further improves the total capacity of V2I links and the success rate of periodic secure message transmission over V2V links.
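The abstract names three standard ingredients of D3QN-LS without giving formulas: a dueling value/advantage head, a double-DQN target, and a soft target-network update. The sketch below shows the textbook form of each in plain Python, as an illustration only. It is not the authors' implementation; the function names, parameters, and the use of flat lists for network outputs and weights are all assumptions made for this example, and the low-dimensional fingerprint input is not modeled.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: List[float],
                      q_target_next: List[float],
                      done: bool = False) -> float:
    """Double DQN target: the online net picks the action,
    the target net evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


def soft_update(target_params: List[float],
                online_params: List[float],
                tau: float) -> List[float]:
    """Soft (Polyak) update: theta_target <- tau * theta_online
    + (1 - tau) * theta_target, applied element-wise."""
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]
```

In a multi-agent setting like the one described, each V2V link would hold its own online and target networks and apply these steps independently; the value of `tau` (hypothetical here) controls how slowly the target network tracks the online one.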
Pages: 13447-13460
Page count: 14