Multi-Agent Guided Deep Reinforcement Learning Approach Against State Perturbed Adversarial Attacks

Cited by: 0
Authors
Cerci, Cagri [1 ]
Temeltas, Hakan [2 ]
Affiliations
[1] Istanbul Tech Univ, Dept Mechatron Engn, TR-34467 Maslak, Istanbul, Turkiye
[2] Istanbul Tech Univ, Dept Control & Automat Engn, TR-34467 Maslak, Istanbul, Turkiye
Keywords
Training; Robustness; Mathematical models; Neural networks; Data models; Noise measurement; Perturbation methods; Autonomous vehicles; Multi-agent systems; Heuristic algorithms; Reinforcement learning; Adversarial attack; guided policy search; multi-agent reinforcement learning; encirclement
DOI
10.1109/ACCESS.2024.3485036
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Deep reinforcement learning (DRL) algorithms interact with their environment and learn without labeled data, evolving their policies in high-dimensional spaces to maximize the rewards they collect. They have applications in fields such as search and rescue, reconnaissance, military operations, firefighting, and autonomous vehicles. There are, however, situations these algorithms struggle to handle. Simulation environments assume that the exact values of the observation data are received intact, but if a neural network model encounters inputs that differ from those seen during training, it cannot make accurate predictions, which leaves it vulnerable to the corrupted state data that may be encountered in real-world applications. In this study, the state-adversarial Markov decision process (SA-MDP) was investigated to increase robustness, and a state-perturbation adversarial attack model was integrated into the DRL algorithm. To enable appropriate decisions under perturbation, a guide actor, which is used only in the training phase and makes decisions from healthy observation data, guides a control actor, which makes decisions from the perturbation model's outputs. The proposed algorithm was applied to a target-encirclement task with 3, 5, and 7 agents in multi-agent simulation environments built with the Pyglet library, and the guided approach was applied to both the multi-agent soft actor-critic (MA-SAC) and multi-agent twin delayed deep deterministic policy gradient (MA-TD3) algorithms. The results show that our approach performs close to MA-SAC and MA-TD3 agents trained in noise-free environments.
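The abstract's key mechanism is a guide actor that decides from clean observations and steers a control actor that decides from perturbed observations. The minimal PyTorch sketch below illustrates only that guidance step; the network architecture, the uniform-noise stand-in for the attack model, the MSE guidance loss, and all names (Actor, perturb, guide_actor, control_actor) are illustrative assumptions, since the abstract does not specify the paper's exact loss or perturbation model.

import torch
import torch.nn as nn

class Actor(nn.Module):
    # Simple deterministic policy head (TD3-style); the paper's architecture is unknown.
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def perturb(obs: torch.Tensor, eps: float = 0.1) -> torch.Tensor:
    # Placeholder state-perturbation "attack": bounded uniform noise.
    # The paper integrates an adversarial attack model here instead.
    return obs + eps * (2.0 * torch.rand_like(obs) - 1.0)

obs_dim, act_dim = 8, 2
guide_actor = Actor(obs_dim, act_dim)    # sees clean states; training only
control_actor = Actor(obs_dim, act_dim)  # sees perturbed states; deployed
optimizer = torch.optim.Adam(control_actor.parameters(), lr=3e-4)

for step in range(1000):
    clean_obs = torch.randn(32, obs_dim)        # stand-in for a batch of env observations
    noisy_obs = perturb(clean_obs)
    with torch.no_grad():
        target_action = guide_actor(clean_obs)  # guide decides from healthy data
    # Guidance loss: the control actor mimics the guide despite perturbation.
    # The full algorithm would combine this term with the usual MA-SAC/MA-TD3 objectives.
    loss = nn.functional.mse_loss(control_actor(noisy_obs), target_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Freezing the guide actor during this step (torch.no_grad) keeps gradients flowing only into the control actor, matching the abstract's note that the guide is used only during training.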
Pages: 156146-156159
Page count: 14