Memory-Based Deep Reinforcement Learning for Obstacle Avoidance in UAV With Limited Environment Knowledge

Cited by: 179

Authors:

Singla, Abhik [1]

Padakandla, Sindhu [2]

Bhatnagar, Shalabh [2]

Affiliations:

[1] Indian Inst Sci, Robert Bosch Ctr Cyber Phys Syst, Bangalore 560012, Karnataka, India

[2] Indian Inst Sci, Dept Comp Sci & Automat, Bengaluru 560012, India

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2021 / Vol. 22 / No. 1

Keywords:

Collision avoidance; Navigation; Cameras; Unmanned aerial vehicles; Simultaneous localization and mapping; Visualization; Unmanned aerial vehicle (UAV) obstacle avoidance (OA); deep reinforcement learning (DRL); partial observability; deep Q-networks (DQN)

DOI:

10.1109/TITS.2019.2954952

Chinese Library Classification (CLC) number:

TU [Building Science]

Discipline classification code:

0813

Abstract:

This paper presents our method for enabling a UAV quadrotor, equipped with a monocular camera, to autonomously avoid collisions with obstacles in unstructured and unknown indoor environments. Compared to obstacle avoidance in ground vehicular robots, UAV navigation brings additional challenges because UAV motion is no longer constrained to a well-defined indoor ground or street environment. Unlike ground vehicular robots, a UAV has to navigate around more types of obstacles: objects such as decorative items, furnishings, ceiling fans, sign-boards, and tree branches are also potential obstacles for a UAV. Thus, methods of obstacle avoidance developed for ground robots are inadequate for UAV navigation. Current control methods using monocular images for UAV obstacle avoidance are heavily dependent on environment information. These controllers do not fully retain and utilize the extensively available information about the ambient environment for decision making. We propose a deep reinforcement learning based method for UAV obstacle avoidance (OA) that is capable of doing exactly that. The crucial idea in our method is the concept of partial observability and how UAVs can retain relevant information about the environment structure to make better future navigation decisions. Our OA technique uses recurrent neural networks with temporal attention and provides better results compared to prior works in terms of distance covered without collisions. In addition, our technique has a high inference rate and reduces power wastage, as it minimizes oscillatory motion of the UAV.

Pages: 107-118

Page count: 12

