Cooperative defense of autonomous surface vessels with quantity disadvantage using behavior cloning and deep reinforcement learning

Cited by: 3
Authors
Sun, Siqing [1, 2]
Li, Tianbo [1]
Chen, Xiao [3]
Dong, Huachao [1, 2]
Wang, Xinjing [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Aoxiang State Key Lab, Xian 710072, Peoples R China
[3] Air Force Engn Univ, Air & Missile Def Coll, Xian 710051, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous surface vessels; Cooperative defense; Behavior cloning; Deep reinforcement learning; Reward design; MULTIAGENT SYSTEMS; LEVEL; TIME; GAME;
DOI
10.1016/j.asoc.2024.111968
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Autonomous Surface Vessels (ASVs) excel at hazardous tasks and have recently attracted significant attention. ASV cooperative defense, in which ASVs intercept intruders before they reach a protected region, is a crucial application for protecting harbors and combating smugglers. Unlike most research, which assumes the defenders have a numerical advantage, this work considers a more practical defense mission with fewer defenders than intruders, defender damage, and intruders that employ evasion strategies. This setting also introduces interception challenges, including the underactuated dynamics of ASVs, a limited interception time window, and environmental nonstationarity, so directly applying existing defense methods may not succeed. To overcome these challenges, we propose an ASV decision-making framework that integrates supervised learning and deep reinforcement learning. First, supervised learning trains the ASVs on actions from a bi-level controller, which addresses the underactuated dynamics and aids policy convergence. Next, deep reinforcement learning explores more effective policies to raise interception rates. Hybrid rewards are designed to drive policy optimization while mitigating environmental nonstationarity. Finally, numerical simulations verify the effectiveness of the approach.
Pages: 15