Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning

Cited by: 1
Authors
Lu, Renzhi [1, 2, 3, 4]
Wang, Xiaotao [5 ]
Ding, Yiyu [5 ]
Zhang, Hai-Tao [6, 7]
Zhao, Feng [8 ]
Zhu, Lijun [9 ]
He, Yong [10, 11]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Key Lab Image Proc & Intelligent Control, Wuhan 430074, Peoples R China
[2] Minist Educ, Key Lab Ind Internet Things & Networked Control, Chongqing 400065, Peoples R China
[3] Chongqing Univ, State Key Laboratory of Mech Transmiss Adv Equipme, Chongqing 400044, Peoples R China
[4] Hubei Key Lab Adv Control & Intelligent Automat C, Wuhan 430074, Peoples R China
[5] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[6] Huazhong Univ Sci & Technol, Inst Artificial Intelligence, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat,State Key L, Wuhan 430074, Peoples R China
[7] Guangdong HUST Ind Technol Res Inst, Guangdong Prov Engn Technol Res Ctr Autonomous Un, Dongguan 523808, Peoples R China
[8] China Ship Sci Res Ctr, Wuxi 214082, Peoples R China
[9] Huazhong Univ Sci & Technol, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[10] China Univ Geosci, Sch Automat, Hubei Key Lab Adv Control & Intelligent Automat, Wuhan 430074, Peoples R China
[11] China Univ Geosci, Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Actor-critic networks; Lyapunov functions; reinforcement learning (RL); surrounding control; unmanned surface vessels (USVs); MULTIAGENT SYSTEMS; AVOIDANCE;
DOI
10.1109/TNNLS.2024.3474289
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, an optimal surrounding control algorithm is proposed for multiple unmanned surface vessels (USVs), in which actor-critic reinforcement learning (RL) is utilized to optimize the merging process. Specifically, the multiple-USV optimal surrounding control problem is first transformed into the Hamilton-Jacobi-Bellman (HJB) equation, which is difficult to solve due to its nonlinearity. An adaptive actor-critic RL control paradigm is then proposed to obtain the optimal surrounding strategy, wherein the Bellman residual error is utilized to construct the network update laws. In particular, a virtual controller representing intermediate transitions and an actual controller operating on a dynamics model are employed as surrounding control solutions for second-order USVs; thus, optimal surrounding control of the USVs is guaranteed. In addition, the stability of the proposed controller is analyzed by means of Lyapunov functions. Finally, numerical simulation results demonstrate that the proposed actor-critic RL-based surrounding controller achieves the surrounding objective while optimizing the evolution process, reducing trajectory length and energy consumption by 9.76% and 20.85%, respectively, compared with the existing controller.
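For orientation, the HJB equation and Bellman residual mentioned in the abstract can be sketched in their generic textbook form. This is an illustrative sketch only, not the paper's formulation: the control-affine dynamics $\dot{x} = f(x) + g(x)u$, the quadratic cost weights $Q$ and $R$, the basis vector $\phi$, the weight estimates $\hat{W}_c$ and $\hat{W}_a$, and the learning rate $\eta_c$ are all assumptions introduced here.

```latex
% Assumed setup (not from the record): \dot{x} = f(x) + g(x)u, with cost
%   V(x(t)) = \int_t^\infty ( x^\top Q x + u^\top R u ) \, d\tau .
\begin{align}
  % HJB equation for the optimal value function V^*
  0 &= \min_{u} \Big[ x^\top Q x + u^\top R u
        + \nabla V^{*}(x)^\top \big( f(x) + g(x) u \big) \Big], \\
  % Minimizing control obtained from the stationarity condition
  u^{*}(x) &= -\tfrac{1}{2} R^{-1} g(x)^\top \nabla V^{*}(x), \\
  % Critic and actor approximations over an assumed basis \phi(x)
  \hat{V}(x) &= \hat{W}_c^\top \phi(x), \qquad
  \hat{u}(x) = -\tfrac{1}{2} R^{-1} g(x)^\top \nabla\phi(x)^\top \hat{W}_a, \\
  % Bellman residual: the HJB right-hand side evaluated at the estimates
  \delta &= x^\top Q x + \hat{u}^\top R \hat{u}
        + \hat{W}_c^\top \nabla\phi(x) \big( f(x) + g(x) \hat{u} \big), \\
  % One common critic law: gradient descent on \tfrac{1}{2}\delta^2
  \dot{\hat{W}}_c &= -\eta_c \, \delta \, \nabla\phi(x) \big( f(x) + g(x) \hat{u} \big).
\end{align}
```

Here $\nabla\phi(x)$ denotes the Jacobian of the basis vector, so $\nabla\hat{V}(x) = \nabla\phi(x)^\top \hat{W}_c$. The residual $\delta$ vanishes when the estimates satisfy the HJB equation, which is why update laws of this kind are driven by it; the specific laws used in the article may differ.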
Pages: 14
Related papers
50 in total
  • [21] Mutli-agent consensus under communication failure using Actor-Critic Reinforcement Learning
    Kandath, Harikumar
    Senthilnath, J.
    Sundaram, Suresh
    2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 1461 - 1465
  • [22] Tunnel ventilation control via an actor-critic algorithm employing nonparametric policy gradients
    Chu, Baeksuk
    Hong, Daehie
    Park, Jooyoung
    JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2009, 23 (02) : 311 - 323
  • [23] Advancements in UAV Path Planning: A Deep Reinforcement Learning Approach with Soft Actor-Critic for Enhanced Navigation
    Guo, Jingrui
    Zhou, Guanzhong
    Huang, Hailong
    Huang, Chao
    UNMANNED SYSTEMS, 2024,
  • [24] Optimal Tracking Control of Heterogeneous Multi-agent Systems with Switching Topology via Actor-Critic Neural Networks
    Peng, Zhinan
    Hu, Jiangping
    Ghosh, Bijoy K.
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 7037 - 7042
  • [25] Tunnel ventilation control via an actor-critic algorithm employing nonparametric policy gradients
    Baeksuk Chu
    Daehie Hong
    Jooyoung Park
    Journal of Mechanical Science and Technology, 2009, 23 : 311 - 323
  • [26] A Robust Mean-Field Actor-Critic Reinforcement Learning Against Adversarial Perturbations on Agent States
    Zhou, Ziyuan
    Liu, Guanjun
    Zhou, Mengchu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14370 - 14381
  • [27] Bioinspired actor-critic algorithm for reinforcement learning interpretation with Levy-Brown hybrid exploration strategy
    Wang, Xiao
    Li, Dazi
    NEUROCOMPUTING, 2024, 574
  • [28] Bearing-Only Motional Target-Surrounding Control for Multiple Unmanned Surface Vessels
    Hu, Bin-Bin
    Zhang, Hai-Tao
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2022, 69 (04) : 3988 - 3997
  • [29] Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors
    Duan, Jingliang
    Guan, Yang
    Li, Shengbo Eben
    Ren, Yangang
    Sun, Qi
    Cheng, Bo
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (11) : 6584 - 6598
  • [30] Event-Triggered Optimal Tracking Control for Underactuated Surface Vessels via Neural Reinforcement Learning
    Liu, Xiang
    Yan, Huaicheng
    Zhou, Weixiang
    Wang, Ning
    Wang, Yueying
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (11) : 12837 - 12847