Underactuated MSV path following control via stable adversarial inverse reinforcement learning

Cited by: 2
Authors
Li, Lingyu [1, 2, 3, 4, 5]
Ma, Yong [1, 2, 3, 4, 5]
Wu, Defeng [6]
Affiliations
[1] Wuhan Univ Technol, State Key Lab Waterway Traff Control & Safety, Wuhan 430063, Hubei, Peoples R China
[2] Wuhan Univ Technol, Sch Nav, Wuhan 430063, Hubei, Peoples R China
[3] Natl Engn Res Ctr Water Transport Safety, Wuhan 430063, Hubei, Peoples R China
[4] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Hainan, Peoples R China
[5] Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401120, Peoples R China
[6] Jimei Univ, Sch Marine Engn, Xiamen 361021, Fujian, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Underactuated marine surface vehicle; Path-following; Inverse reinforcement learning; Imitation learning; TRACKING CONTROL; USV;
DOI
10.1016/j.oceaneng.2024.117368
Chinese Library Classification (CLC) number
U6 [Waterway Transportation]; P75 [Ocean Engineering];
Discipline classification code
0814; 081505; 0824; 082401;
Abstract
Model-based control approaches are inadequate for solving the marine surface vehicle (MSV) path-following problem, especially in adverse environments. To address this problem, model-free deep reinforcement learning (DRL) methods have been developed. However, defining an efficient reward function for DRL in path-following tasks is difficult, whereas providing expert demonstrations is often easier. We therefore propose a model-free stable adversarial inverse reinforcement learning (SAIRL) algorithm that uses only the state of the MSV and reconstructs the reward function from expert demonstrations. The SAIRL algorithm is designed to guarantee the prescribed MSV path-following accuracy and training stability. It uses an alternative loss function and a dual-discriminator framework to resolve policy collapse, which arises from the vanishing gradient of the discriminator. Simulations and experiments show that the SAIRL algorithm outperforms the baseline algorithms in path-following accuracy and convergence stability.
Pages: 9