Optimal Frequency Reuse and Power Control in Multi-UAV Wireless Networks: Hierarchical Multi-Agent Reinforcement Learning Perspective

Cited by: 10

Authors:

Lee, Seungmin ^{[1,2]}

Lim, Suhyeon ^{[1,2]}

Chae, Seong Ho ^{[3]}

Jung, Bang Chul ^{[4]}

Park, Chan Yi ^{[5]}

Lee, Howon ^{[1,2]}

Affiliations:

[1] Hankyong Natl Univ, Sch Elect & Elect Engn, Anseong 17579, South Korea

[2] Hankyong Natl Univ, Inst IT Convergence IITC, Anseong 17579, South Korea

[3] Tech Univ Korea, Dept Elect Engn, Siheung Si 15073, South Korea

[4] Chungnam Natl Univ, Dept Elect Engn, Daejeon 34134, South Korea

[5] Agcy Def Dev, Daejeon 34186, South Korea

Source:

IEEE ACCESS | 2022 / Vol. 10

Keywords:

Frequency conversion; Computer architecture; Time-frequency analysis; Microprocessors; Wireless networks; Q-learning; Autonomous aerial vehicles; Unmanned aerial vehicle; optimal frequency reuse; transmit power control; energy efficiency; hierarchical multi-agent Q-learning; multi-UAV wireless network; COVERAGE; ACCESS;

DOI:

10.1109/ACCESS.2022.3166179

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

To overcome the problems caused by limited battery lifetime in multi-unmanned aerial vehicle (UAV) wireless networks, we propose a hierarchical multi-agent reinforcement learning (RL) framework that maximizes the energy efficiency (EE) of UAVs by finding the optimal frequency reuse factor and transmit power. The proposed algorithm consists of a distributed inner-loop RL for transmit power control of the UAV terminal (UT) and a centralized outer-loop RL for finding the optimal frequency reuse factor. Specifically, the proposed algorithm adjusts these two factors jointly to mitigate intercell interference and reduce unnecessary transmit power consumption in multi-UAV wireless networks. As a result, we show that the proposed algorithm outperforms conventional algorithms, such as a random action algorithm with a fixed frequency reuse factor and a hierarchical multi-agent Q-learning algorithm with binary transmit power control. Furthermore, even in an environment where UTs move continuously according to a mixed mobility model, we show that the proposed algorithm achieves a higher reward than the conventional algorithms.
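The two-level structure described in the abstract (a centralized outer loop that picks the frequency reuse factor, and distributed inner loops in which each agent picks its own transmit power) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy energy-efficiency model, the constants, the candidate reuse factors and power levels, and the function names `energy_efficiency`, `epsilon_greedy`, and `train` are all hypothetical. The updates are simplified to single-state Q-learning (no state transitions or discount factor), which is not the authors' formulation.

```python
import math
import random

# All values below are hypothetical and chosen only for illustration.
POWER_LEVELS = [0.1, 0.5, 1.0, 2.0]  # candidate transmit powers (W)
REUSE_FACTORS = [1, 2, 4]            # candidate frequency reuse factors
NUM_UAVS = 4
NOISE = 0.05                         # noise power (arbitrary units)
CIRCUIT_POWER = 0.2                  # fixed per-UAV power overhead (W)


def energy_efficiency(reuse, powers):
    """Toy EE model (an assumption): sum rate divided by total power."""
    total_rate = 0.0
    for i, p in enumerate(powers):
        # Only agents assigned the same channel (index mod reuse) interfere.
        interference = sum(
            0.3 * q
            for j, q in enumerate(powers)
            if j != i and j % reuse == i % reuse
        )
        sinr = p / (NOISE + interference)
        # Each cell gets a 1/reuse share of the bandwidth.
        total_rate += (1.0 / reuse) * math.log2(1.0 + sinr)
    return total_rate / (sum(powers) + NUM_UAVS * CIRCUIT_POWER)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def train(outer_episodes=200, inner_steps=50, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Centralized outer agent: one Q-value per reuse factor.
    outer_q = [0.0] * len(REUSE_FACTORS)
    # Distributed inner agents: one Q-table per (reuse factor, UAV).
    inner_q = [
        [[0.0] * len(POWER_LEVELS) for _ in range(NUM_UAVS)]
        for _ in REUSE_FACTORS
    ]
    for _ in range(outer_episodes):
        f_idx = epsilon_greedy(outer_q, epsilon, rng)
        reuse = REUSE_FACTORS[f_idx]
        best_reward = 0.0
        for _ in range(inner_steps):
            # Each agent selects its power independently from its own table.
            actions = [
                epsilon_greedy(inner_q[f_idx][u], epsilon, rng)
                for u in range(NUM_UAVS)
            ]
            powers = [POWER_LEVELS[a] for a in actions]
            reward = energy_efficiency(reuse, powers)
            for u, a in enumerate(actions):
                inner_q[f_idx][u][a] += alpha * (reward - inner_q[f_idx][u][a])
            best_reward = max(best_reward, reward)
        # The outer agent learns from the best EE the inner loop reached.
        outer_q[f_idx] += alpha * (best_reward - outer_q[f_idx])
    best_f = max(range(len(REUSE_FACTORS)), key=outer_q.__getitem__)
    best_powers = [
        POWER_LEVELS[max(range(len(POWER_LEVELS)), key=table.__getitem__)]
        for table in inner_q[best_f]
    ]
    return REUSE_FACTORS[best_f], best_powers, outer_q


if __name__ == "__main__":
    reuse, powers, _ = train()
    print("reuse factor:", reuse, "powers:", powers)
```

Nesting the power-control loop inside each reuse-factor episode means the outer agent scores a reuse factor only after the inner agents have had a chance to adapt their powers to it, which is one simple way to realize the joint adjustment the abstract describes.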

Pages: 39555-39565

Page count: 11
