Generative adversarial interactive imitation learning for path following of autonomous underwater vehicle

Cited by: 10
Authors
Jiang, Dong [1]
Huang, Jie [1]
Fang, Zheng [1]
Cheng, Chunxi [1]
Sha, Qixin [1]
He, Bo [1]
Li, Guangliang [1]
Affiliations
[1] Ocean Univ China, Coll Elect Engn, Qingdao, Peoples R China
Keywords
Deep reinforcement learning; Autonomous control; Autonomous underwater vehicle; Imitation learning; Interactive reinforcement learning; LEVEL CONTROL; PID CONTROL; DEEP;
DOI
10.1016/j.oceaneng.2022.111971
Chinese Library Classification (CLC) number
U6 [Waterway transportation]; P75 [Ocean engineering];
Subject classification codes
0814; 081505; 0824; 082401;
Abstract
Autonomous underwater vehicles (AUVs) play an increasingly important role in marine scientific research and resource exploration because of their flexibility. Recently, deep reinforcement learning (DRL) has been used to improve AUV autonomy. However, defining efficient reward functions for DRL to learn control policies across various tasks is very time-consuming and can even be impractical. In this paper, we implemented the generative adversarial imitation learning (GAIL) algorithm, which learns from demonstrated trajectories, and proposed GA2IL, which learns from demonstrations and additional human rewards, for AUV path following. We evaluated GAIL and our GA2IL method on a straight-line following task and a sinusoidal curve following task on the Gazebo platform, extended to simulated underwater environments with our lab's AUV simulator. Both methods were compared with PPO, a classic deep reinforcement learning algorithm that learns from a predefined reward function, and with a well-tuned PID controller. In addition, to evaluate the generalization of GAIL and GA2IL, we tested the control policies trained on the previous two tasks on a new, complex comb-scan following task and a different sinusoidal curve following task, respectively. Our simulation results show that AUV path following with GA2IL and GAIL achieves performance similar to that of PPO and the PID controller in both tasks. Moreover, GA2IL generalizes as well as PPO and adapts better to complex and different tasks than the traditional PID controller.
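The abstract does not say how GA2IL combines the imitation signal with human rewards. The following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the common GAIL convention in which a discriminator outputs D(s, a), the probability that a state-action pair came from the demonstrator, and the policy receives the surrogate reward -log(1 - D(s, a)). The function names `gail_reward` and `combined_reward`, the `weight` parameter, and the weighted-sum combination are all hypothetical.

```python
import math


def gail_reward(d_prob: float, eps: float = 1e-8) -> float:
    """Surrogate imitation reward from a discriminator output.

    d_prob is D(s, a) in [0, 1]: the (assumed) probability that the
    state-action pair came from the demonstrator. Higher values give
    a larger reward. eps keeps the logarithm finite when d_prob is 1.
    """
    return -math.log(max(1.0 - d_prob, eps))


def combined_reward(d_prob: float, human_reward: float, weight: float = 0.5) -> float:
    """Hypothetical combination: imitation reward plus weighted human feedback.

    The weighted sum is an assumption for illustration only; the
    abstract does not specify how GA2IL mixes the two signals.
    """
    return gail_reward(d_prob) + weight * human_reward


if __name__ == "__main__":
    # A pair the discriminator finds expert-like, with positive human feedback.
    print(combined_reward(d_prob=0.9, human_reward=1.0))
    # A pair the discriminator finds unlike the expert, with negative feedback.
    print(combined_reward(d_prob=0.1, human_reward=-1.0))
```

In a sketch like this, the resulting scalar would replace the predefined task reward when updating the policy with an algorithm such as PPO, which is what removes the need to hand-design a reward function.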
Pages: 11
Related papers
50 in total
  • [41] PATH FOLLOWING CONTROL OF FULLY ACTUATED AUTONOMOUS UNDERWATER VEHICLE BASED ON LADRC
    Lamraoui, Habib Choukri
    Zhu Qidan
    POLISH MARITIME RESEARCH, 2018, 25 (04) : 39 - 48
  • [42] Learning Temporal Strategic Relationships using Generative Adversarial Imitation Learning
    Fernando, Tharindu
    Denman, Simon
    Sridharan, Sridha
    Fookes, Clinton
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), 2018, : 113 - 121
  • [43] Path Tracking Control for an Autonomous Underwater Vehicle
    Hernandez, Ruben D.
    Falchetto, Vinicius B.
    Ferreira, Janito V.
    2015 WORKSHOP ON ENGINEERING APPLICATIONS - INTERNATIONAL CONGRESS ON ENGINEERING (WEA), 2015,
  • [44] Informative Path Planning for an Autonomous Underwater Vehicle
    Binney, Jonathan
    Krause, Andreas
    Sukhatme, Gaurav S.
    2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, : 4791 - 4796
  • [45] THREE-DIMENSIONAL PATH-FOLLOWING CONTROL OF AN AUTONOMOUS UNDERWATER VEHICLE BASED ON DEEP REINFORCEMENT LEARNING
    Liang, Zhenyu
    Qu, Xingru
    Zhang, Zhao
    Chen, Cong
    POLISH MARITIME RESEARCH, 2022, 29 (04) : 36 - 44
  • [46] Multi-hop path reasoning of temporal knowledge graphs based on generative adversarial imitation learning
    Bai, Luyi
    Xiao, Qianwen
    Zhu, Lin
    KNOWLEDGE-BASED SYSTEMS, 2025, 316
  • [47] Deep reinforcement learning for adaptive path planning and control of an autonomous underwater vehicle
    Hadi, Behnaz
    Khosravi, Alireza
    Sarhadi, Pouria
    APPLIED OCEAN RESEARCH, 2022, 129
  • [48] GAILPG: Multiagent Policy Gradient With Generative Adversarial Imitation Learning
    Li, Wei
    Huang, Shiyi
    Qiu, Ziming
    Song, Aiguo
    IEEE TRANSACTIONS ON GAMES, 2025, 17 (01) : 62 - 75
  • [49] GACS: Generative Adversarial Imitation Learning Based on Control Sharing
    Huaiwei SI
    Guozhen TAN
    Dongyu LI
    Yanfei PENG
    Journal of Systems Science and Information, 2023, 11 (01) : 78 - 93
  • [50] Generative Adversarial Network for Imitation Learning from Single Demonstration
    Tho Nguyen Duc
    Chanh Minh Tran
    Phan Xuan Tan
    Kamioka, Eiji
    BAGHDAD SCIENCE JOURNAL, 2021, 18 (04) : 1350 - 1355