Dynamic Spectrum Anti-Jamming With Reinforcement Learning Based on Value Function Approximation

Cited by: 9
Authors
Zhu, Xinyu [1]
Huang, Yang [1]
Wang, Shaoyu [1]
Wu, Qihui [1]
Ge, Xiaohu [2]
Liu, Yuan [3]
Gao, Zhen [4]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Key Lab Dynam Cognit Syst Electromagnet Spectrum S, Minist Ind & Informat Technol, Nanjing 210016, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[3] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Peoples R China
[4] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Jamming; Internet of Things; Wireless networks; Time-frequency analysis; Interference; Decision making; Channel estimation; Uplink transmissions; anti-jamming; Markov decision process; reinforcement learning; ALGORITHM;
DOI
10.1109/LWC.2022.3228045
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This letter addresses the spectrum anti-jamming problem with multiple Internet of Things (IoT) devices for uplink transmissions, where policies for configuring frequency-domain channels have to be learned without knowledge of the time-frequency distribution of the interference. This decision-making or learning problem is expected to be solved by reinforcement learning (RL) approaches. However, state-of-the-art RL-based spectrum anti-jamming methods may not be applicable in IoT systems, may suffer from high computational complexity, or may converge to a policy that is not the best for each user. Therefore, we propose a novel spectrum anti-jamming scheme where configuration policies for the IoT devices are sequentially optimized with value function approximation-based multi-agent RL. Simulation results show that our proposed algorithm outperforms various baselines in terms of average normalized throughput.
Pages: 386-390
Page count: 5