Optimizing UAV-UGV coalition operations: A hybrid clustering and multi-agent reinforcement learning approach for path planning in obstructed environment

Cited: 9
Authors
Brotee, Shamyo [1 ]
Kabir, Farhan [1 ]
Razzaque, Md. Abdur [1 ]
Roy, Palash [2 ]
Mamun-Or-Rashid, Md. [1 ]
Hassan, Md. Rafiul [3 ]
Hassan, Mohammad Mehedi [4 ]
Affiliations
[1] Univ Dhaka, Green Networking Res Grp, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] Green Univ Bangladesh, Dept Comp Sci & Engn, Dhaka, Bangladesh
[3] Univ Maine Presque Isle, Coll Arts & Sci, Presque Isle, ME 04769 USA
[4] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Syst, Riyadh, Saudi Arabia
Keywords
UAV-UGV coalition; Path planning; Multi-agent deep reinforcement learning; Mean-shift clustering; Obstructed environment;
DOI
10.1016/j.adhoc.2024.103519
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
One of the most critical applications undertaken by Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) is reaching predefined targets by following the most time-efficient routes while avoiding collisions. Unfortunately, UAVs are hampered by limited battery life, and UGVs face challenges in reachability due to obstacles and elevation variations, which is why a coalition of UAVs and UGVs can be highly effective. Existing literature primarily focuses on one-to-one coalitions, which constrains the efficiency of reaching targets. In this work, we introduce a novel approach for a UAV-UGV coalition with a variable number of vehicles, employing a modified mean-shift clustering algorithm (MEANCRFT) to segment targets into multiple zones. This approach of assigning targets to various circular zones based on density and range significantly reduces the time required to reach these targets. Moreover, introducing variability in the number of UAVs and UGVs in a coalition enhances task efficiency by enabling simultaneous multi-target engagement. In our approach, each vehicle of the coalition is trained using two advanced deep reinforcement learning algorithms in two separate experiments, namely Multi-agent Deep Deterministic Policy Gradient (MADDPG) and Multi-agent Proximal Policy Optimization (MAPPO). The results of our experimental evaluation demonstrate that the proposed MEANCRFT-MADDPG method substantially surpasses current state-of-the-art techniques, nearly doubling efficiency in terms of target navigation time and task completion rate.
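The paper's MEANCRFT algorithm is a modified mean-shift; its exact modification is not described in this record, but the underlying clustering step can be illustrated with a minimal flat-kernel mean-shift in pure Python. This is a sketch only: the `bandwidth` parameter, the flat kernel, and the toy target coordinates are assumptions for illustration, not the authors' method.

```python
import math

def mean_shift(points, bandwidth, max_iter=50, tol=1e-3):
    """Flat-kernel mean-shift: repeatedly move each point to the mean
    of its neighbours within `bandwidth`, then merge converged modes
    into zone centres and label each target with its zone index."""
    shifted = [list(p) for p in points]
    for _ in range(max_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            neigh = [q for q in points if math.dist(p, q) <= bandwidth]
            mean = [sum(c) / len(neigh) for c in zip(*neigh)]
            moved = max(moved, math.dist(p, mean))
            shifted[i] = mean
        if moved < tol:
            break
    # Merge modes closer than bandwidth/2 into a single zone centre
    centres, labels = [], []
    for p in shifted:
        for j, c in enumerate(centres):
            if math.dist(p, c) <= bandwidth / 2:
                labels.append(j)
                break
        else:
            centres.append(p)
            labels.append(len(centres) - 1)
    return centres, labels

# Toy target field: two dense groups of targets
targets = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.4),
           (5.0, 5.0), (5.3, 4.8), (4.9, 5.2)]
centres, labels = mean_shift(targets, bandwidth=1.5)
print(len(centres))  # prints 2: one zone per dense target group
```

Each resulting zone centre would then serve as the focus for one variably sized UAV-UGV coalition, with the per-vehicle policies trained by MADDPG or MAPPO as the abstract describes.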
Pages: 16