Dual-robot formation transport in random environment and narrow restricted area via improved DDPG navigation

Cited by: 0

Authors:

Tang, Liang [1]

Ma, Ronggeng [1]

Chen, Bowen [1]

Niu, Yisen [1]

Affiliations:

[1] Hubei University of Technology, Wuhan, People's Republic of China

Source:

DISCOVER APPLIED SCIENCES | 2025 / Vol. 7 / Issue 03

Keywords:

Mobile robots navigation; Dual-robot transport; Deterministic policy gradient; Collision avoidance; Restricted environment; DYNAMIC WINDOW APPROACH; REINFORCEMENT; MANIPULATORS; AVOIDANCE;

DOI:

10.1007/s42452-025-06572-7

Chinese Library Classification (CLC) number:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

For the task of two robots transporting a long rod that does not deflect or deform, a dual-robot system model is established, and an optimized navigation method based on Deep Deterministic Policy Gradient (DDPG) is proposed to address poor navigation paths and the problem of the robot system encountering blocked exits in narrow areas during collaborative transportation. The method aims to generate optimized navigation paths for leader-follower robot formations carrying out tasks such as moving objects, and it designs a swap-decision reward function for DDPG to resolve the problem of blocked exits in narrow areas. Specifically, the paper first optimizes the reward function module of the DDPG network to incorporate a decision-swapping reward mechanism for training the formation's navigation capability. Next, it uses an Unscented Kalman Filter to estimate the formation's position states for follower trajectory tracking. Finally, the navigation performance of the formation is validated through simulations. The simulation results show that the formation achieves a navigation success rate of around 95% in random environments. Additionally, compared with paths generated by the A*-DWA and RRT* algorithms, the trained DDPG navigation algorithm reduces the average angle variation of the generated paths by 64.03% and 38.65%, respectively. Furthermore, in trap environments, the formation can execute decision swapping to exit restricted areas and generate a passable path.
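
The abstract compares paths by their average angle variation but does not define the metric. The sketch below shows one plausible reading, assumed here for illustration only: the mean absolute heading change between consecutive segments of a 2D waypoint path. The function name `average_angle_variation` and the sample paths are hypothetical and are not taken from the paper.

```python
import math


def average_angle_variation(path):
    """Mean absolute heading change (radians) between consecutive path segments.

    Assumed definition for illustration; the paper may define the metric differently.
    `path` is a sequence of (x, y) waypoints.
    """
    # Heading of each segment between consecutive waypoints.
    headings = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(path, path[1:])
    ]
    if len(headings) < 2:
        return 0.0  # fewer than two segments: no turn to measure

    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the difference into [-pi, pi) so a turn is never counted the long way round.
        delta = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        turns.append(abs(delta))
    return sum(turns) / len(turns)


# Hypothetical sample paths: a straight line and a zigzag with 90-degree turns.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]

print(average_angle_variation(straight))
print(average_angle_variation(zigzag))
```

Under this assumed definition, a smoother path scores lower, which matches the direction of the reductions reported against A*-DWA and RRT*.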


Pages: 21
