Reinforcement Learning for Energy-Efficient Trajectory Design of UAVs

Cited by: 26
Authors
Arani, Atefeh Hajijamali [1]
Azari, M. Mahdi [2]
Hu, Peng [1,3]
Zhu, Yeying [1]
Yanikomeroglu, Halim [4]
Safavi-Naeini, Safieddin [5]
Affiliations
[1] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
[2] Univ Luxembourg, SnT, L-4365 Esch Sur Alzette, Luxembourg
[3] Natl Res Council Canada, Digital Technol Res Ctr, Waterloo, ON K1A 0R6, Canada
[4] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[5] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Trajectory; Three-dimensional displays; Unmanned aerial vehicles; Energy consumption; Throughput; Reinforcement learning; Propagation losses; Energy efficiency; multiarmed bandit (MAB); reinforcement learning; unmanned aerial vehicles (UAVs); COMMUNICATION; NETWORKS; DEPLOYMENT; PLACEMENT; COVERAGE;
DOI
10.1109/JIOT.2021.3118322
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Integrating unmanned aerial vehicles (UAVs) as aerial base stations (BSs) into terrestrial cellular networks has emerged as an effective solution to provide coverage and complement communication services in a fast and cost-effective manner. The three-dimensional (3-D) trajectories of UAVs have a remarkable impact on the performance of such networks. On the other hand, UAVs are battery limited, and thus optimizing their energy consumption is of high importance. In this regard, we propose a novel trajectory design mechanism for rotary-wing UAV-BSs in 3-D space to improve the energy efficiency of the network. In this approach, UAVs aim at maximizing an objective function that captures the tradeoff between energy consumption and throughput, while satisfying their ground users' quality-of-service requirements. Using reinforcement learning, we model our problem as a multiarmed bandit and propose an upper confidence bound-based algorithm to solve the problem. In our proposed mechanism, UAVs autonomously choose their velocities and update their locations adapted to the system conditions without requiring prior and full knowledge of the system. Simulation results show that our proposed approach yields significant performance gains, reaching up to 33.85% in terms of improving the network throughput and up to 95% in terms of enhancing the energy efficiency compared to a learning-based benchmark. Compared with a nonlearning-based approach, our proposed approach improves the throughput and energy efficiency by 46.61% and 110%, respectively.
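The abstract names a multiarmed bandit formulation solved with an upper confidence bound (UCB)-based algorithm but gives no details of it. As a rough illustration only, the following is a minimal sketch of the textbook UCB1 rule, which is not necessarily the variant used in the paper. The candidate speeds, the reward probabilities, and the function names `ucb1_select` and `run_ucb1` are all hypothetical and chosen for this example; in the paper the reward would come from the energy/throughput objective rather than from a toy distribution.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the arm with the highest UCB1 score at round t."""
    # Play every arm once before relying on the confidence bounds.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = empirical mean + exploration bonus that shrinks with more pulls.
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(reward_fn, n_arms, horizon, c=2.0):
    """Run UCB1 for `horizon` rounds; return per-arm pull counts and mean rewards."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, means, t, c)
        reward = reward_fn(arm)
        counts[arm] += 1
        # Incremental update of the running mean reward for the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    # Hypothetical action set: candidate UAV speeds in m/s (assumption).
    speeds = [0.0, 5.0, 10.0, 15.0, 20.0]
    # Hypothetical chance that each speed yields a "good" outcome (assumption).
    success_prob = [0.2, 0.4, 0.8, 0.5, 0.3]
    rng = random.Random(0)

    def toy_reward(arm):
        # Bernoulli reward standing in for the paper's objective function.
        return 1.0 if rng.random() < success_prob[arm] else 0.0

    counts, means = run_ucb1(toy_reward, len(speeds), horizon=2000)
    for speed, n, m in zip(speeds, counts, means):
        print(f"speed={speed:5.1f} m/s  pulls={n:4d}  mean reward={m:.3f}")
```

The exploration bonus lets an agent try under-sampled actions without a prior model of the environment, which matches the abstract's statement that the UAVs learn without prior and full knowledge of the system.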
Pages: 9060-9070
Number of pages: 11