Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning

Cited by: 1
Authors
Lu, Renzhi [1 ,2 ,3 ,4 ]
Wang, Xiaotao [5 ]
Ding, Yiyu [5 ]
Zhang, Hai-Tao [6 ,7 ]
Zhao, Feng [8 ]
Zhu, Lijun [9 ]
He, Yong [10 ,11 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Key Lab Image Proc & Intelligent Control, Wuhan 430074, Peoples R China
[2] Minist Educ, Key Lab Ind Internet Things & Networked Control, Chongqing 400065, Peoples R China
[3] Chongqing Univ, State Key Laboratory of Mech Transmiss Adv Equipme, Chongqing 400044, Peoples R China
[4] Hubei Key Lab Adv Control & Intelligent Automat C, Wuhan 430074, Peoples R China
[5] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[6] Huazhong Univ Sci & Technol, Inst Artificial Intelligence, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat,State Key L, Wuhan 430074, Peoples R China
[7] Guangdong HUST Ind Technol Res Inst, Guangdong Prov Engn Technol Res Ctr Autonomous Un, Dongguan 523808, Peoples R China
[8] China Ship Sci Res Ctr, Wuxi 214082, Peoples R China
[9] Huazhong Univ Sci & Technol, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[10] China Univ Geosci, Sch Automat, Hubei Key Lab Adv Control & Intelligent Automat, Wuhan 430074, Peoples R China
[11] China Univ Geosci, Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Actor-critic networks; Lyapunov functions; reinforcement learning (RL); surrounding control; unmanned surface vessels (USVs); MULTIAGENT SYSTEMS; AVOIDANCE;
DOI
10.1109/TNNLS.2024.3474289
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, an optimal surrounding control algorithm is proposed for multiple unmanned surface vessels (USVs), in which actor-critic reinforcement learning (RL) is utilized to optimize the merging process. Specifically, the multiple-USV optimal surrounding control problem is first transformed into the Hamilton-Jacobi-Bellman (HJB) equation, which is difficult to solve due to its nonlinearity. An adaptive actor-critic RL control paradigm is then proposed to obtain the optimal surrounding strategy, wherein the Bellman residual error is utilized to construct the network update laws. In particular, a virtual controller representing intermediate transitions and an actual controller operating on a dynamics model are employed as surrounding control solutions for second-order USVs; thus, optimal surrounding control of the USVs is guaranteed. In addition, the stability of the proposed controller is analyzed by means of Lyapunov functions. Finally, numerical simulation results demonstrate that the proposed actor-critic RL-based surrounding controller achieves the surrounding objective while optimizing the evolution process, and reduces trajectory length and energy consumption by 9.76% and 20.85%, respectively, compared with an existing controller.
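The idea of building actor and critic update laws from the Bellman residual can be illustrated on a toy problem. The sketch below is not the article's USV controller: it assumes a hypothetical scalar plant x_dot = a*x + b*u with running cost q*x^2 + r*u^2, a one-weight critic V(x) = w_c*x^2, and a one-weight actor u(x) = -(b/r)*w_a*x. All names, gains, learning rates, and sample states are illustrative assumptions. The critic descends the squared HJB residual, and the actor is pulled toward the policy implied by the critic.

```python
import math

# Hypothetical scalar plant x_dot = A*x + B*u with running cost Q*x^2 + R*u^2.
# These values are illustrative assumptions, not taken from the article.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0


def bellman_residual(w_c, w_a, x):
    """HJB residual for critic V(x) = w_c*x^2 and actor u(x) = -(B/R)*w_a*x."""
    u = -(B / R) * w_a * x
    dV_dx = 2.0 * w_c * x
    return Q * x * x + R * u * u + dV_dx * (A * x + B * u)


def train(steps=500, lr_c=0.05, lr_a=0.5, samples=(0.5, 1.0, 1.5)):
    """Gradient descent on the squared residual (critic) plus actor tracking."""
    w_c, w_a = 0.0, 0.0
    for _ in range(steps):
        grad = 0.0
        for x in samples:
            u = -(B / R) * w_a * x
            delta = bellman_residual(w_c, w_a, x)
            # Derivative of the residual with respect to the critic weight.
            d_delta_d_wc = 2.0 * x * (A * x + B * u)
            grad += delta * d_delta_d_wc
        w_c -= lr_c * grad / len(samples)
        # Move the actor weight toward the policy implied by the critic.
        w_a += lr_a * (w_c - w_a)
    return w_c, w_a


def riccati_weight():
    """Closed-form optimal weight from the scalar algebraic Riccati equation."""
    return R * (A + math.sqrt(A * A + B * B * Q / R)) / (B * B)


w_c, w_a = train()
w_star = riccati_weight()
```

For this linear-quadratic case the HJB equation reduces to a scalar Riccati equation, so `w_star` gives a closed-form value against which the learned weights `w_c` and `w_a` can be compared. The nonlinear multi-vessel problem in the article has no such closed form, which is why approximation by actor-critic networks is used there.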
Pages: 14