Drone-Cell Trajectory Planning and Resource Allocation for Highly Mobile Networks: A Hierarchical DRL Approach

Cited by: 59

Authors:

Shi, Weisen ^{[1]}

Li, Junling ^{[1,2]}

Wu, Huaqing ^{[1]}

Zhou, Conghao ^{[1]}

Cheng, Nan ^{[3]}

Shen, Xuemin ^{[1]}

Affiliations:

[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Guangdong, Peoples R China

[3] Xidian Univ, Sch Telecommun, Xian 710071, Peoples R China

Source:

IEEE INTERNET OF THINGS JOURNAL | 2021 / Vol. 8 / No. 12

Funding:

National Natural Science Foundation of China; Natural Sciences and Engineering Research Council of Canada;

Keywords:

Trajectory; Planning; Resource management; Throughput; Internet of Things; Radio access networks; Real-time systems; Drone cell; drone-assisted radio access network (RAN); space-air-ground integration; trajectory planning; VEHICULAR NETWORKS; DESIGN; 5G; UAVS;

DOI:

10.1109/JIOT.2020.3020067

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Drone cell (DC) is envisioned to enable the dynamic service provisioning for radio access networks (RANs), in response to the spatial and temporal unevenness of user traffic. In this article, we propose a hierarchical deep reinforcement learning (DRL)-based multi-DC trajectory planning and resource allocation (HDRLTPRA) scheme for high-mobility users. The objective is to maximize the accumulative network throughput while satisfying user fairness, DC power consumption, and DC-to-ground link quality constraints. To address the high uncertainties of the environment, we decouple the multi-DC TPRA problem into two hierarchical subproblems, i.e., the higher level global trajectory planning (GTP) subproblem and the lower level local TPRA (LTPRA) subproblem. First, the GTP subproblem is to address trajectory planning for multiple DCs in the RAN over a long time period. To solve the subproblem, we propose a multiagent DRL-based GTP (MARL-GTP) algorithm in which the nonstationary state space caused by the multi-DC environment is addressed by the multiagent fingerprint technique. Second, based on the GTP results, each DC solves the LTPRA subproblem independently to control the movement and transmit power allocation based on the real-time user traffic variations. A deep deterministic policy gradient (DEP)-based LTPRA (DEP-LTPRA) algorithm is then proposed to solve the LTPRA subproblem. With the two algorithms addressing both subproblems at different decision granularities, the multi-DC TPRA problem can be resolved by the HDRLTPRA scheme. Simulation results show that 40% network throughput improvement can be achieved by the proposed HDRLTPRA scheme over the nonlearning-based TPRA scheme.
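The abstract names the multiagent fingerprint technique as the way the MARL-GTP algorithm handles nonstationarity. The following is a minimal sketch of the general idea only, not the authors' implementation: each agent's local observation is extended with a low-dimensional tag describing the training stage of the other agents, so that replayed experiences can be told apart by age. The names `Fingerprint` and `augment_observation`, the observation layout, and the choice of training iteration plus exploration rate as the tag are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fingerprint:
    """Hypothetical tag describing the training stage of the other agents."""
    train_iteration: int  # number of training iterations elapsed so far
    epsilon: float        # current exploration rate of the agents


def augment_observation(obs: List[float], fp: Fingerprint,
                        max_iterations: int) -> List[float]:
    """Append a normalized fingerprint to one agent's local observation."""
    return list(obs) + [fp.train_iteration / max_iterations, fp.epsilon]


# Assumed layout of the local observation: x position, y position, served load.
obs = [0.2, 0.7, 0.5]
early = augment_observation(obs, Fingerprint(100, 0.9), max_iterations=10_000)
late = augment_observation(obs, Fingerprint(9_000, 0.05), max_iterations=10_000)
```

Here `early` and `late` share the same physical observation but differ in their last two entries, which is what would let a value network distinguish experiences collected while the other agents followed older policies.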

Citation

Pages: 9800 / 9813

Page count: 14
