Drone-Cell Trajectory Planning and Resource Allocation for Highly Mobile Networks: A Hierarchical DRL Approach

Cited by: 53

Authors:

Shi, Weisen [1]

Li, Junling [1,2]

Wu, Huaqing [1]

Zhou, Conghao [1]

Cheng, Nan [3]

Shen, Xuemin [1]

Affiliations:

[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Guangdong, Peoples R China

[3] Xidian Univ, Sch Telecommun, Xian 710071, Peoples R China

Source:

IEEE INTERNET OF THINGS JOURNAL | 2021 / Vol. 8 / Issue 12

Funding:

Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China;

Keywords:

Trajectory; Planning; Resource management; Throughput; Internet of Things; Radio access networks; Real-time systems; Drone cell; drone-assisted radio access network (RAN); space-air-ground integration; trajectory planning; VEHICULAR NETWORKS; DESIGN; 5G; UAVS;

DOI:

10.1109/JIOT.2020.3020067

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Drone cell (DC) is envisioned to enable the dynamic service provisioning for radio access networks (RANs), in response to the spatial and temporal unevenness of user traffic. In this article, we propose a hierarchical deep reinforcement learning (DRL)-based multi-DC trajectory planning and resource allocation (HDRLTPRA) scheme for high-mobility users. The objective is to maximize the accumulative network throughput while satisfying user fairness, DC power consumption, and DC-to-ground link quality constraints. To address the high uncertainties of the environment, we decouple the multi-DC TPRA problem into two hierarchical subproblems, i.e., the higher level global trajectory planning (GTP) subproblem and the lower level local TPRA (LTPRA) subproblem. First, the GTP subproblem is to address trajectory planning for multiple DCs in the RAN over a long time period. To solve the subproblem, we propose a multiagent DRL-based GTP (MARL-GTP) algorithm in which the nonstationary state space caused by the multi-DC environment is addressed by the multiagent fingerprint technique. Second, based on the GTP results, each DC solves the LTPRA subproblem independently to control the movement and transmit power allocation based on the real-time user traffic variations. A deep deterministic policy gradient (DEP)-based LTPRA (DEP-LTPRA) algorithm is then proposed to solve the LTPRA subproblem. With the two algorithms addressing both subproblems at different decision granularities, the multi-DC TPRA problem can be resolved by the HDRLTPRA scheme. Simulation results show that 40% network throughput improvement can be achieved by the proposed HDRLTPRA scheme over the nonlearning-based TPRA scheme.


Pages: 9800 - 9813

Page count: 14

23 items in total

[21] Toward Optimal Resource Allocation: A Multi-Agent DRL Based Task Offloading Approach in Multi-UAV-Assisted MEC Networks
Tariq, Muhammad Naqqash
Wang, Jingyu
Raza, Salman
Siraj, Mohammad
Altamimi, Majid
Memon, Saifullah
IEEE ACCESS, 2024, 12 : 81428 - 81440
[22] A Hybrid Secure Resource Allocation and Trajectory Optimization Approach for Mobile Edge Computing Using Federated Learning Based on WEB 3.0
Consul, Prakhar
Budhiraja, Ishan
Garg, Deepak
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01) : 1167 - 1179
[23] Distributed Resource Allocation for Self-Organizing Small Cell Networks: An Evolutionary Game Approach
Semasinghe, Prabodini
Zhu, Kun
Hossain, Ekram
2013 IEEE GLOBECOM WORKSHOPS (GC WKSHPS), 2013, : 702 - 707
