Underactuated MSV path following control via stable adversarial inverse reinforcement learning

Cited by: 2
Authors
Li, Lingyu [1, 2, 3, 4, 5]
Ma, Yong [1, 2, 3, 4, 5]
Wu, Defeng [6]
Affiliations
[1] Wuhan Univ Technol, State Key Lab Waterway Traff Control & Safety, Wuhan 430063, Hubei, Peoples R China
[2] Wuhan Univ Technol, Sch Nav, Wuhan 430063, Hubei, Peoples R China
[3] Natl Engn Res Ctr Water Transport Safety, Wuhan 430063, Hubei, Peoples R China
[4] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Hainan, Peoples R China
[5] Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401120, Peoples R China
[6] Jimei Univ, Sch Marine Engn, Xiamen 361021, Fujian, Peoples R China
Funding
US National Science Foundation;
Keywords
Underactuated marine surface vehicle; Path-following; Inverse reinforcement learning; Imitation learning; TRACKING CONTROL; USV;
DOI
10.1016/j.oceaneng.2024.117368
Chinese Library Classification number
U6 [Waterway transportation]; P75 [Ocean engineering];
Discipline classification codes
0814; 081505; 0824; 082401;
Abstract
Model-based control approaches are inadequate for the marine surface vehicle (MSV) path-following problem, especially in adverse environments. To address this problem, model-free deep reinforcement learning (DRL) methods have been developed. However, defining an efficient reward function for DRL in path-following tasks is difficult, and providing expert demonstrations is often easier than designing an effective reward function. Thus, we propose a model-free stable adversarial inverse reinforcement learning (SAIRL) algorithm that uses only the state of the MSV and reconstructs the reward function from expert demonstrations. The SAIRL algorithm is designed to guarantee the prescribed MSV path-following accuracy and training stability. It uses an alternative loss function and a dual-discriminator framework to resolve the issue of policy collapse, which arises from the vanishing gradient of the discriminator. Simulations and experiments show that the SAIRL algorithm outperforms the baseline algorithms in path-following accuracy and convergence stability.
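For context, the generic adversarial IRL (AIRL) formulation, which the name SAIRL suggests as a starting point, recovers a reward from a discriminator of the form D = exp(f) / (exp(f) + pi). The Python sketch below illustrates only that generic relationship. It is not the paper's alternative loss or dual-discriminator design, neither of which the abstract specifies. The function names and the scalar inputs `f_value` (a learned state-only score) and `log_pi` (the policy's log-probability of the action taken) are assumptions made for illustration.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Generic AIRL discriminator D = exp(f) / (exp(f) + pi).

    Algebraically this equals sigmoid(f - log_pi); the two branches
    avoid overflow in exp() for large positive or negative arguments.
    Both inputs are hypothetical scalars, not quantities from the paper.
    """
    z = f_value - log_pi
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def airl_reward(f_value: float, log_pi: float) -> float:
    """Recovered reward log D - log(1 - D), which simplifies to f - log_pi."""
    return f_value - log_pi


if __name__ == "__main__":
    # Illustrative values only: a score of 1.0 and an action log-probability of -0.5.
    d = airl_discriminator(1.0, -0.5)
    print(d, airl_reward(1.0, -0.5))
```

As D approaches 0 or 1 the sigmoid saturates and its gradient shrinks, which is one common way to picture the vanishing-gradient issue the abstract mentions.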
Pages: 9