Joint path planning and power allocation of a cellular-connected UAV using apprenticeship learning via deep inverse reinforcement learning

Cited by: 5
Authors
Shamsoshoara, Alireza [1,2]
Lotfi, Fatemeh [2]
Mousavi, Sajad [2,3]
Afghah, Fatemeh [2]
Guevenc, Ismail [2,4]
Affiliations
[1] No Arizona Univ, Sch Informat Comp & Cyber Syst, Flagstaff, AZ 86011 USA
[2] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
[3] Harvard Med Sch, Boston, MA USA
[4] North Carolina State Univ, Raleigh, NC USA
Funding
US National Science Foundation
Keywords
Apprenticeship learning; Cellular-connected drones; Inverse reinforcement learning; Path planning; UAV communication; COMMUNICATION; NETWORKS; ALTITUDE;
DOI
10.1016/j.comnet.2024.110789
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
This paper investigates an interference-aware joint path planning and power allocation mechanism for a cellular-connected unmanned aerial vehicle (UAV) in a sparse suburban environment. The UAV's goal is to fly from an initial point to a destination point, moving along the cells so as to guarantee the required quality of service (QoS). In particular, the UAV aims to maximize its uplink throughput and minimize interference to the ground user equipment (UE) connected to neighboring cellular base stations (BSs), while accounting for both the shortest path and limits on flight resources. Expert knowledge of the scenario is used to define the desired behavior for training the agent (i.e., the UAV). To solve the problem, an apprenticeship learning method is applied via inverse reinforcement learning (IRL), based on both Q-learning and deep reinforcement learning (DRL). Its performance is compared to that of a learning-from-demonstration technique, behavioral cloning (BC), implemented with supervised learning. Simulation and numerical results show that the proposed approach can achieve expert-level performance, and that, unlike BC, its performance does not degrade in unseen situations.
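The apprenticeship-learning approach described in the abstract recovers a reward from expert behavior rather than imitating actions directly, which is why it generalizes better than behavioral cloning. A minimal sketch of the underlying idea, the projection-style IRL loop of Abbeel and Ng (2004, reference [4] in the paper's bibliography), is shown below on a toy 1-D chain world; the chain environment, one-hot state features, horizon, and "expert" policy are illustrative assumptions, not the paper's UAV/cellular setup.

```python
# Sketch of apprenticeship learning via IRL (projection method,
# Abbeel & Ng 2004). Toy assumptions: a 5-cell 1-D chain, one-hot
# state features, and an "expert" that always moves toward cell N-1.

N, GAMMA, H = 5, 0.9, 30      # states, discount factor, rollout horizon

def step(s, a):               # deterministic move along the chain
    return min(max(s + a, 0), N - 1)

def feature_expectations(policy):
    """Discounted sum of one-hot state features under a fixed policy."""
    mu, s = [0.0] * N, 0
    for t in range(H):
        mu[s] += GAMMA ** t
        s = step(s, policy[s])
    return mu

def greedy_policy(w, iters=100):
    """Value iteration for state reward R(s) = w[s]; greedy policy."""
    V = [0.0] * N
    for _ in range(iters):
        V = [w[s] + GAMMA * max(V[step(s, -1)], V[step(s, +1)])
             for s in range(N)]
    return [-1 if V[step(s, -1)] >= V[step(s, +1)] else +1
            for s in range(N)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

mu_E = feature_expectations([+1] * N)    # expert heads to the goal cell
mu_bar = feature_expectations([-1] * N)  # initial (poor) learner policy
for _ in range(20):                      # projection iterations
    w = [e - b for e, b in zip(mu_E, mu_bar)]      # candidate reward weights
    mu = feature_expectations(greedy_policy(w))    # best response to w
    d = [m - b for m, b in zip(mu, mu_bar)]
    c = dot(d, w) / dot(d, d)                      # projection step size
    mu_bar = [b + c * di for b, di in zip(mu_bar, d)]

err = [e - b for e, b in zip(mu_E, mu_bar)]
gap = dot(err, err) ** 0.5               # small gap => expert's features matched
print(f"feature-expectation gap: {gap:.2e}")
```

The learner never sees the expert's reward, only its feature expectations `mu_E`; the loop alternately guesses reward weights and computes a best-response policy until the feature gap shrinks, which is the same expert-matching principle the paper scales up with deep networks.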
Pages: 20