UAV Control Method Combining Reptile Meta-Reinforcement Learning and Generative Adversarial Imitation Learning

Times Cited: 2
Authors
Jiang, Shui [1 ]
Ge, Yanning [1 ]
Yang, Xu [2 ]
Yang, Wencheng [3 ]
Cui, Hui [4 ]
Affiliations
[1] Fujian Normal Univ, Coll Comp & Cyber Secur, Fuzhou 350007, Peoples R China
[2] Minjiang Univ, Coll Comp & Control Engn, Fuzhou 350108, Peoples R China
[3] Univ Southern Queensland, Sch Math Phys & Comp, Darling Hts, Qld 4350, Australia
[4] Monash Univ, Dept Software Syst & Cybersecur, Melbourne, Vic 3800, Australia
Keywords
unmanned aerial vehicles (UAVs); meta-reinforcement learning; generative adversarial imitation learning;
DOI
10.3390/fi16030105
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract
Reinforcement learning (RL) is pivotal in empowering Unmanned Aerial Vehicles (UAVs) to navigate and make decisions efficiently and intelligently within complex and dynamic surroundings. Despite its significance, RL is hampered by inherent limitations such as low sample efficiency, restricted generalization capabilities, and a heavy reliance on the intricacies of reward function design. These challenges often render single-method RL approaches inadequate, particularly in the context of UAV operations, where the high costs and safety risks of real-world deployment cannot be overlooked. To address these issues, this paper introduces a novel RL framework that synergistically integrates meta-learning and imitation learning. By leveraging the Reptile algorithm from meta-learning and Generative Adversarial Imitation Learning (GAIL), coupled with state normalization techniques for processing state data, this framework significantly enhances the model's adaptability. It achieves this by identifying and exploiting commonalities across tasks, allowing swift adaptation to new challenges without the need for complex reward function design. To ascertain the efficacy of this integrated approach, we conducted simulation experiments in two-dimensional environments. The empirical results clearly indicate that our GAIL-enhanced Reptile method surpasses conventional single-method RL algorithms in training efficiency. This evidence underscores the potential of combining meta-learning and imitation learning to surmount the traditional barriers faced by reinforcement learning in UAV trajectory planning and decision-making.
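To make two of the named ingredients concrete, the sketch below shows a Reptile-style outer update (move meta-parameters toward task-adapted parameters) and online state normalization via Welford's running mean/variance. This is a minimal illustration under our own naming and hyperparameter assumptions, not the authors' implementation; the GAIL discriminator and the inner-loop RL algorithm are omitted.

```python
import numpy as np

def reptile_meta_update(meta_params, adapted_params, meta_lr=0.1):
    """Reptile outer step: theta <- theta + eps * (theta_task - theta).

    `adapted_params` are the parameters after a few inner-loop RL updates
    on a sampled task; Reptile needs no second-order gradients.
    """
    return {k: meta_params[k] + meta_lr * (adapted_params[k] - meta_params[k])
            for k in meta_params}

class RunningStateNormalizer:
    """Online state normalization using Welford's running mean/variance."""
    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def normalize(self, x, eps=1e-8):
        var = self.m2 / max(self.n - 1, 1)
        return (x - self.mean) / np.sqrt(var + eps)
```

In a full pipeline, each meta-iteration would sample a UAV task, run the GAIL-shaped inner-loop policy updates on normalized states, and then apply `reptile_meta_update` to the shared initialization.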
Pages: 18
Related Papers (30 in total)
[1] Azar, A.T.; Koubaa, A.; Ali Mohamed, N.; Ibrahim, H.A.; Ibrahim, Z.F.; Kazim, M.; Ammar, A.; Benjdira, B.; Khamis, A.M.; Hameed, I.A.; Casalino, G. Drone Deep Reinforcement Learning: A Review. Electronics, 2021, 10(9).
[2] Beck, J. et al. arXiv preprint, 2024. arXiv:2301.08028, DOI: 10.48550/ARXIV.2301.08028.
[3] Belkhale, S.; Li, R.; Kahn, G.; McAllister, R.; Calandra, R.; Levine, S. Model-Based Meta-Reinforcement Learning for Flight With Suspended Payloads. IEEE Robotics and Automation Letters, 2021, 6(2): 1471-1478.
[4] Cui, J.; Liu, Y.; Nallanathan, A. Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks. IEEE Transactions on Wireless Communications, 2020, 19(2): 729-743.
[5] Eschmann, J. Reinforcement Learning Algorithms: Analysis and Applications, 2021, p. 25.
[6] Finn, C. Proceedings of Machine Learning Research, 2017, Vol. 70.
[7] He, L. arXiv preprint, 2020. arXiv:2008.02521.
[8] Ho, J. Advances in Neural Information Processing Systems, 2016, Vol. 29.
[9] Hu, Y.; Chen, M.; Saad, W.; Poor, H.V.; Cui, S. Meta-Reinforcement Learning for Trajectory Design in Wireless UAV Networks. 2020 IEEE Global Communications Conference (GLOBECOM), 2020.
[10] Hussein, A.; Gaber, M.M.; Elyan, E.; Jayne, C. Imitation Learning: A Survey of Learning Methods. ACM Computing Surveys, 2017, 50(2).