Reinforcement Learning Based Obstacle Avoidance for Autonomous Underwater Vehicle

Cited by: 49
Authors
Bhopale, Prashant [1 ]
Kazi, Faruk [1 ]
Singh, Navdeep [1 ]
Affiliations
[1] Veermata Jijabai Technol Inst, Elect Engn Dept, Mumbai 400019, Maharashtra, India
Keywords
Obstacle avoidance; Autonomous underwater vehicle; Reinforcement learning; Q-learning; Function approximation;
DOI
10.1007/s11804-019-00089-3
CLC (Chinese Library Classification) codes
U6 [Water Transportation]; P75 [Ocean Engineering];
Discipline classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
Obstacle avoidance becomes a very challenging task for an autonomous underwater vehicle (AUV) exploring an unknown underwater environment. Successful control in such cases may be achieved using model-based classical control techniques such as PID and MPC, but these require an accurate mathematical model of the AUV and may fail due to parametric uncertainties, disturbances, or plant-model mismatch. In contrast, a model-free reinforcement learning (RL) algorithm can be designed from the actual behavior of the AUV plant in an unknown environment, and the learned controller is not affected by model uncertainties in the way a classical controller is. Unlike a model-based controller, a model-free RL-based controller does not need to be manually retuned as the environment changes. A standard one-step Q-learning based controller can be used for obstacle avoidance, but its tendency to explore all possible actions in a given state may increase the number of collisions. Hence, a modified Q-learning based control approach is proposed to address these problems in an unknown environment. Furthermore, function approximation with a neural network (NN) is used to handle the continuous states and large state spaces that arise in RL-based controller design. The proposed modified Q-learning algorithm is validated in MATLAB simulations by comparing it with the standard Q-learning algorithm for single-obstacle avoidance, and the same algorithm is then applied to multiple-obstacle avoidance problems.
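The one-step Q-learning update the abstract builds on can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the discretized state/action sizes, reward, hyperparameters, and function names here are assumptions, and the paper's modified exploration strategy and NN function approximation are only noted in comments.

```python
import numpy as np

# Hypothetical toy setup (assumption): a small discretized AUV state space
# and a few heading actions, with a tabular Q-function.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate
n_states, n_actions = 25, 4
Q = np.zeros((n_states, n_actions))

def select_action(state, rng):
    """Epsilon-greedy selection. The paper's modified Q-learning variant
    curbs this blind exploration to reduce collisions (not shown here)."""
    if rng.random() < EPSILON:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def q_update(s, a, r, s_next):
    """Standard one-step Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s_next]) - Q[s, a])

rng = np.random.default_rng(0)
s = 0
a = select_action(s, rng)
q_update(s, a, -1.0, 1)   # e.g. a step penalty of -1 while moving to state 1
```

For the continuous states discussed in the abstract, the table `Q` would be replaced by a neural-network approximator trained on the same temporal-difference target.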
Pages: 228-238
Page count: 11