Proximal Policy Optimization-Based Hierarchical Decision-Making Mechanism for Resource Allocation Optimization in UAV Networks

Times cited: 0
Authors
Sun, Kun [1]
Yang, Jianyong [1]
Li, Jinglei [2]
Yang, Bo [2]
Ding, Shuman [2]
Affiliations
[1] 54th Res Inst China Elect Technol Grp Corp, Shijiazhuang 050000, Peoples R China
[2] Xidian Univ, Sch Telecommun Engn, Xian 710071, Peoples R China
Source
ELECTRONICS | 2025 / Vol. 14 / No. 04
Keywords
unmanned air vehicle (UAV); proximal policy optimization; spectrum allocation;
DOI
10.3390/electronics14040747
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
To address the resource allocation problem in dynamic environments where multiple unmanned aerial vehicle base stations (UAV-BSs) provide efficient downlink services to ground users, this paper proposes a novel hierarchical decision-making mechanism based on the Proximal Policy Optimization (PPO) algorithm. The proposed method optimizes time-frequency resource allocation in the downlink, aiming to maximize the total user throughput over multiple time slots. By constructing channel and interference models, the complex multi-channel resource allocation problem is decomposed into a series of single-channel decision subproblems, significantly reducing the action space complexity. Specifically, the original exponential complexity O(N^M) (where N is the number of users and M is the number of channels) is reduced to a linear complexity O(N), effectively alleviating the curse of dimensionality. Simulation results demonstrate that the proposed hierarchical architecture, integrated with the PPO algorithm, achieves superior performance in terms of total throughput, convergence speed, and stability compared to existing methods. This study provides new insights and technical support for efficient resource management in UAV-BS systems operating in complex and dynamic environments.
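The complexity reduction described in the abstract can be illustrated with a small counting sketch. It assumes, purely for illustration, that each of the M channels is assigned to exactly one of the N users; the record does not state the paper's exact action encoding, and the function names below are hypothetical. Under that assumption a joint decision over all channels has N^M possible actions, while a per-channel decision has only N.

```python
from itertools import product


def joint_action_count(num_users: int, num_channels: int) -> int:
    """Size of the joint action space when all channels are assigned at once.

    Assumption (not from the record): each channel goes to exactly one user,
    so there are num_users choices per channel and N**M joint actions.
    """
    return num_users ** num_channels


def per_step_action_count(num_users: int) -> int:
    """Size of the action space of one single-channel subproblem."""
    return num_users


def enumerate_joint_actions(num_users: int, num_channels: int) -> list:
    """Explicitly list every joint assignment (only feasible for tiny N, M)."""
    return list(product(range(num_users), repeat=num_channels))


if __name__ == "__main__":
    N, M = 4, 3  # illustrative sizes, not taken from the paper
    joint = enumerate_joint_actions(N, M)
    # The enumeration agrees with the closed-form count N**M.
    assert len(joint) == joint_action_count(N, M)
    print("joint actions:", len(joint))                      # 64
    print("actions per single-channel step:", per_step_action_count(N))  # 4
    print("single-channel steps per slot:", M)               # 3
```

In this sketch the hierarchical variant makes M sequential decisions per time slot, each over N options, so the policy's output layer grows with N rather than with N^M.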
Pages: 20
Related papers
23 in total
[1]   A Survey on Machine-Learning Techniques for UAV-Based Communications [J].
Bithas, Petros S. ;
Michailidis, Emmanouel T. ;
Nomikos, Nikolaos ;
Vouyioukas, Demosthenes ;
Kanatas, Athanasios G. .
SENSORS, 2019, 19 (23)
[2]   A Game-Theoretical Anti-Jamming Scheme for Cognitive Radio Networks [J].
Chen, Changlong ;
Song, Min ;
Xin, ChunSheng ;
Backens, Jonathan .
IEEE NETWORK, 2013, 27 (03) :22-27
[3]   Data Collection Mechanism for UAV-Assisted Cellular Network Based on PPO [J].
Chen, Tuo ;
Dong, Feihong ;
Ye, Hu ;
Wang, Yun ;
Wu, Bin .
ELECTRONICS, 2023, 12 (06)
[4]  
Deb S, 2018, Arxiv, DOI arXiv:1808.03881
[5]   Age of Information Minimization Using Multi-Agent UAVs Based on AI-Enhanced Mean Field Resource Allocation [J].
Emami, Yousef ;
Gao, Hao ;
Li, Kai ;
Almeida, Luis ;
Tovar, Eduardo ;
Han, Zhu .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (09) :13368-13380
[6]   Q-Learning-Based Power Control for LTE Enterprise Femtocell Networks [J].
Gao, Zhibin ;
Wen, Bin ;
Huang, Lianfen ;
Chen, Canbin ;
Su, Ziwen .
IEEE SYSTEMS JOURNAL, 2017, 11 (04) :2699-2707
[7]   A survey on UAV-assisted wireless communications: Recent advances and future trends [J].
Gu, Xiaohui ;
Zhang, Guoan .
COMPUTER COMMUNICATIONS, 2023, 208 :44-78
[8]   A Survey on Spectrum Management for Unmanned Aerial Vehicles (UAVs) [J].
Jasim, Mohammed A. ;
Shakhatreh, Hazim ;
Siasi, Nazli ;
Sawalmeh, Ahmad H. ;
Aldalbahi, Adel ;
Al-Fuqaha, Ala .
IEEE ACCESS, 2022, 10 :11443-11499
[9]   Distributed Heuristically Accelerated Q-Learning for Robust Cognitive Spectrum Management in LTE Cellular Systems [J].
Morozs, Nils ;
Clarke, Tim ;
Grace, David .
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2016, 15 (04) :817-825
[10]   Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access [J].
Naparstek, Oshri ;
Cohen, Kobi .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2019, 18 (01) :310-323