Dynamic Spectrum Anti-Jamming With Reinforcement Learning Based on Value Function Approximation

Cited by: 9
Authors
Zhu, Xinyu [1]
Huang, Yang [1]
Wang, Shaoyu [1]
Wu, Qihui [1]
Ge, Xiaohu [2]
Liu, Yuan [3]
Gao, Zhen [4]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Key Lab Dynam Cognit Syst Electromagnet Spectrum S, Minist Ind & Informat Technol, Nanjing 210016, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[3] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Peoples R China
[4] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Jamming; Internet of Things; Wireless networks; Time-frequency analysis; Interference; Decision making; Channel estimation; Uplink transmissions; anti-jamming; Markov decision process; reinforcement learning; ALGORITHM;
DOI
10.1109/LWC.2022.3228045
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This letter addresses the spectrum anti-jamming problem for uplink transmissions from multiple Internet of Things (IoT) devices, where policies for configuring frequency-domain channels must be learned without knowledge of the time-frequency distribution of the interference. Reinforcement learning (RL) is a natural approach to this decision-making problem. However, state-of-the-art RL-based spectrum anti-jamming methods may not be applicable to IoT systems, may suffer from high computational complexity, or may converge to a policy that is not the best for each user. We therefore propose a novel spectrum anti-jamming scheme in which the configuration policies of the IoT devices are sequentially optimized with value function approximation-based multi-agent RL. Simulation results show that the proposed algorithm outperforms various baselines in terms of average normalized throughput.
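To make the idea in the abstract concrete, the following is a minimal sketch, not the algorithm from the letter. It assumes a hypothetical sweeping jammer, two devices, four channels, and Q-learning with a linear value function over one-hot features (the simplest form of value function approximation). "Sequentially optimized" is read here as updating one device at a time while the others keep their current greedy policies. All names and parameter values (`K`, `N_DEVICES`, `jammed`, `train_device`, `run`, `evaluate`) are illustrative assumptions.

```python
import random

K = 4                            # number of frequency channels (assumed)
N_DEVICES = 2                    # number of IoT devices (assumed)
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.2  # discount, step size, exploration rate (assumed)

def jammed(t):
    """Hypothetical sweeping jammer: hits channel t mod K in slot t."""
    return t % K

def features(state, action):
    """One-hot feature vector over (state, action) pairs."""
    phi = [0.0] * (K * K)
    phi[state * K + action] = 1.0
    return phi

def q_value(w, state, action):
    """Linear value function approximation: Q(s, a) = w . phi(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, features(state, action)))

def greedy(w, state):
    """Channel with the highest approximate Q-value (ties go to the lowest index)."""
    return max(range(K), key=lambda a: q_value(w, state, a))

def train_device(i, weights, rng, slots=2000):
    """Update only device i; every other device acts greedily and stays fixed."""
    w = weights[i]
    for t in range(slots):
        s = jammed(t)                        # state: channel jammed in this slot
        a = rng.randrange(K) if rng.random() < EPS else greedy(w, s)
        others = {greedy(weights[j], s) for j in range(N_DEVICES) if j != i}
        s_next = jammed(t + 1)
        # The chosen channel is used in the next slot: reward 1 if it is
        # neither jammed nor occupied by another device.
        r = 1.0 if a != s_next and a not in others else 0.0
        target = r + GAMMA * max(q_value(w, s_next, b) for b in range(K))
        td_error = target - q_value(w, s, a)
        phi = features(s, a)
        for k in range(len(w)):              # semi-gradient Q-learning update
            w[k] += ALPHA * td_error * phi[k]

def run(seed=0, rounds=3):
    """Sequentially optimize the devices, one at a time, for several rounds."""
    rng = random.Random(seed)
    weights = [[0.0] * (K * K) for _ in range(N_DEVICES)]
    for _ in range(rounds):
        for i in range(N_DEVICES):
            train_device(i, weights, rng)
    return weights

def evaluate(weights, slots=100):
    """Fraction of transmissions that avoid both the jammer and collisions."""
    ok = 0
    for t in range(slots):
        s = jammed(t)
        acts = [greedy(w, s) for w in weights]
        ok += sum(1 for a in acts if a != jammed(t + 1) and acts.count(a) == 1)
    return ok / (slots * len(weights))
```

Calling `evaluate(run())` returns the fraction of successful transmissions under the learned greedy policies, a rough stand-in for the normalized throughput mentioned in the abstract. With one-hot features the linear approximator reduces to a table; a richer feature map or a kernel method would replace `features` without changing the update rule.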
Pages: 386-390
Page count: 5