Multiagent Best Routing in High-Mobility Digital-Twin-Driven Internet of Vehicles (IoV)
Cited by: 1
Authors:
Alam, Md. Zahangir [1,2,3]
Khan, Komal S. [4]
Jamalipour, Abbas [1]
Affiliations:
[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
[2] Independent Univ, Dept Comp Sci & Engn, Dhaka 1229, Bangladesh
[3] Independent Univ, Ctr Computat & Data Sci, Dhaka 1229, Bangladesh
[4] Darktrace, Melbourne, Vic 3000, Australia
Source:
Keywords:
Reliability;
Network topology;
Vehicle dynamics;
Delays;
Topology;
Digital twins;
Heuristic algorithms;
Dynamic graph;
Internet of Vehicle (IoV);
multiagent deep deterministic policy gradient (MADDPG);
multiagent learning;
stochastic process;
RESOURCE-ALLOCATION;
REINFORCEMENT;
RELIABILITY;
NETWORKS;
DOI:
10.1109/JIOT.2023.3338020
Chinese Library Classification (CLC) number:
TP [Automation Technology, Computer Technology];
Discipline classification code:
0812;
Abstract:
A low-delay, high-gain optimal multihop routing path is crucial to guarantee both the latency and reliability requirements of infotainment services in the high-mobility Internet of Vehicles (IoV), subject to queue stability. High mobility in multihop IoV reduces reliability and energy efficiency, and becomes a bottleneck for finding the optimal route with classical optimization methods. Deep reinforcement learning (DRL)-based methods are largely inapplicable in the IoV environment because of the continuously changing topology and the space complexity, which grows exponentially with the number of state variables as well as the number of relaying hops. In a multihop scenario, network reliability and latency are usually affected by mobility as well as by the average hop count, which limit vehicle-to-vehicle (V2V) link connectivity. To cope with this problem, in this article we formulate a minimum-hop-count, delay-sensitive, buffer-aided optimization problem in a dynamic, complex multihop vehicular topology using a digital-twin-enabled dynamic coordination graph (DCG). In particular, for the first time, a decentralized DCG-based multiagent deep deterministic policy gradient (DCG-MADDPG) algorithm is proposed that combines the advantages of DCG and MADDPG to model the continuously changing topology and find optimal routing solutions through cooperative learning. The proposed DCG-MADDPG coordinated learning trains each agent toward highly reliable, low-latency optimal path decisions while maintaining queue stability and convergence on the way to a desired state. Experimental results reveal that the proposed coordinated learning algorithm outperforms existing learning approaches in terms of energy consumption and latency, with lower computational complexity.
Pages: 13708-13721
Number of pages: 14