Multi-objective Game Learning Algorithm Based on Multi-armed Bandit in Underwater Acoustic Communication Networks

Cited by: 1
Authors
Wang, Hui [1]
Yang, Liejun [2, 3]
Affiliations
[1] Minnan Normal Univ, Sch Phys & Informat Engn, 36, Xianqianzhi St, Zhangzhou 363000, Peoples R China
[2] Ningde Normal Univ, Sch Informat & Mech & Elect Engn, 1, Coll Rd, Ningde 352000, Peoples R China
[3] Fujian Prov Univ, Ningde Normal Univ, Key Lab Intelligent Ecotourism & Leisure Agr, Ningde 352100, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
underwater acoustic communication; reinforcement learning; power allocation; multi-armed bandit; protocol
DOI
10.18494/SAM4305
Chinese Library Classification (CLC) number
TH7 [Instruments and meters]
Discipline classification codes
0804; 080401; 081102
Abstract
To address interference in underwater multi-node communication and improve the efficiency of underwater acoustic communication, we propose a multi-objective game learning algorithm based on the multi-armed bandit (MAB) framework. First, the multi-objective optimization problem is formulated as a multi-node MAB game model. Second, we incorporate the overall network interference level and the nodes' power cost into the utility function to achieve the desired optimization objectives. Third, we establish the existence and uniqueness of the Nash equilibrium point of the game model and introduce an improved greedy-strategy MAB learning algorithm to determine the equilibrium solution. Finally, our simulation results demonstrate that the proposed algorithm effectively optimizes interference management while enhancing the nodes' adaptive capabilities.
Pages: 1619-1630
Number of pages: 12