Eavesdropping Game Based on Multi-Agent Deep Reinforcement Learning

Cited by: 0
Authors
Guo, Delin [1 ]
Tang, Lan [1 ]
Yang, Lvxi [2 ]
Liang, Ying-Chang [2 ]
Affiliations
[1] Nanjing Univ, Nanjing, Peoples R China
[2] Southeast Univ, Nanjing, Peoples R China
Source
2022 IEEE 23RD INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATION (SPAWC) | 2022
Funding
National Key R&D Program of China; National Natural Science Foundation of China
Keywords
Physical layer security; proactive eavesdropping; stochastic game; multi-agent reinforcement learning; wiretap channel
DOI
10.1109/SPAWC51304.2022.9833927
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
This paper considers an adversarial scenario between a legitimate eavesdropper and a suspicious communication pair, where all three nodes are equipped with multiple antennas. The eavesdropper, operating in full-duplex mode, aims to wiretap the suspicious pair via proactive jamming, while the suspicious transmitter, which can send artificial noise (AN) to disturb the wiretap channel, aims to guarantee secrecy. Specifically, the eavesdropper adjusts its jamming power to enhance the wiretap rate, while the suspicious transmitter jointly adapts its transmit power and AN power to counter the eavesdropping. Given the partial observations, the complicated interactions between the eavesdropper and the suspicious pair, and the unknown system dynamics, we model the problem as an imperfect-information stochastic game. To approach a Nash equilibrium of the eavesdropping game, we develop a multi-agent reinforcement learning (MARL) algorithm, termed neural fictitious self-play with soft actor-critic (NFSP-SAC), which combines fictitious self-play (FSP) with the deep reinforcement learning algorithm SAC. The introduction of SAC enables FSP to handle problems with continuous, high-dimensional observation and action spaces. Simulation results demonstrate that the power allocation policies learned by our method empirically converge to a Nash equilibrium, whereas the baseline reinforcement learning algorithms suffer from severe fluctuations during learning.
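The abstract describes NFSP-SAC only at a high level. The sketch below illustrates the NFSP control flow the method builds on (anticipatory mixing of a best-response policy and an average policy, following Heinrich & Silver's NFSP), with the SAC learner stubbed out. This is a minimal sketch under stated assumptions, not the authors' implementation: all class and variable names are hypothetical, and the placeholder policies and buffer logic are deliberate simplifications.

```python
import random
import numpy as np

class NFSPAgent:
    """Minimal NFSP control-flow sketch (hypothetical; SAC learner stubbed).

    With probability eta the agent plays its best-response policy (in the
    paper, a SAC actor over continuous power levels) and records the
    (obs, action) pair for supervised learning of its average policy;
    otherwise it plays the average policy.
    """

    def __init__(self, act_dim, eta=0.1, buf_size=100_000):
        self.eta = eta            # anticipatory parameter (mixing probability)
        self.act_dim = act_dim
        self.rl_buffer = []       # transitions feeding the (stubbed) SAC learner
        self.sl_buffer = []       # (obs, action) pairs for the average policy
        self.buf_size = buf_size

    def best_response_action(self, obs):
        # Placeholder for the SAC actor: a clipped Gaussian over power levels.
        return np.clip(np.random.normal(0.5, 0.2, self.act_dim), 0.0, 1.0)

    def average_policy_action(self, obs):
        # Placeholder for the supervised average-policy network: replay a
        # stored best-response action (crude stand-in, ignores obs).
        if self.sl_buffer:
            _, a = random.choice(self.sl_buffer)
            return a
        return np.full(self.act_dim, 0.5)

    def act(self, obs):
        if random.random() < self.eta:
            a = self.best_response_action(obs)
            # Store for supervised learning (crude stand-in for NFSP's
            # reservoir buffer: random replacement once full).
            if len(self.sl_buffer) < self.buf_size:
                self.sl_buffer.append((obs, a))
            else:
                self.sl_buffer[random.randrange(self.buf_size)] = (obs, a)
            return a
        return self.average_policy_action(obs)

    def observe(self, transition):
        self.rl_buffer.append(transition)   # (s, a, r, s') for the SAC learner

    def update(self):
        # Full algorithm: one SAC gradient step on rl_buffer (best response)
        # and one supervised step on sl_buffer (average policy). Stubbed here.
        pass

# Toy interaction loop for the two sides of the game (placeholder dynamics):
# agent 0 = eavesdropper (jamming power), agent 1 = transmitter (transmit, AN).
agents = [NFSPAgent(act_dim=1), NFSPAgent(act_dim=2)]
obs = np.zeros(4)                            # placeholder observation
for step in range(100):
    actions = [ag.act(obs) for ag in agents]
    for ag in agents:
        ag.observe((obs, actions, 0.0, obs))  # placeholder reward and next state
        ag.update()
```

As both agents' average policies are trained on their own best responses, their empirical strategies track the time-averaged play, which is what allows the learned power allocation policies to converge toward a Nash equilibrium rather than oscillate.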
Pages: 5