Learning strategies for underwater robot autonomous manipulation control

Cited by: 2

Authors:

Huang, Hai [1]

Jiang, Tao [1]

Zhang, Zongyu [1]

Sun, Yize [1]

Qin, Hongde [1]

Li, Xinyang [1]

Yang, Xu [2]

Affiliations:

[1] Harbin Engn Univ, Natl Key Lab Sci & Technol Underwater Vehicle, Harbin 150001, Peoples R China

[2] Inst Automat, Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Source:

JOURNAL OF THE FRANKLIN INSTITUTE | 2024 / Vol. 361 / Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

Underwater vehicle manipulator system; Strategy learning; Autonomous manipulation; Deep reinforcement learning; Transferred learning; REINFORCEMENT; VEHICLES;

DOI:

10.1016/j.jfranklin.2024.106773

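Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract: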

Autonomous manipulation requires highly intelligent coordination between robotic vision and control, and it is a hallmark of advances in robotic intelligence. The limitations of visual sensing and increasingly complex experimental conditions make autonomous manipulation more difficult, particularly for deep reinforcement learning methods, which can enhance the intelligence of robotic control but require extensive training. Because underwater operations have a high-dimensional continuous state space and a continuous action space, this paper adopts policy-based reinforcement learning as the foundational approach. To address the instability and low convergence efficiency of traditional policy-based reinforcement learning algorithms, this paper proposes a novel policy learning method. The method adopts the Proximal Policy Optimization algorithm (PPO-Clip) and optimizes it through an actor-critic network, with the aim of improving the stability and effectiveness of convergence during learning. For the underwater training environment, a new reward shaping scheme is designed to address reward sparsity during training. A manually crafted dense reward function acts as attractive and repulsive potential functions for goal manipulation and obstacle avoidance. For the highly complex underwater manipulation and training environment, a transfer learning algorithm is established to reduce training time and compensate for the differences between simulation and experiment. Simulations and tank experiments have verified the performance of the proposed strategy learning method.
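The abstract does not give the form of the dense reward, so the following is only a minimal sketch of one common way to build such a reward from attractive and repulsive potentials. It assumes the classical artificial potential field form (quadratic attraction to the goal, inverse-distance repulsion inside an influence radius) and rewards the decrease in total potential between consecutive steps. All function names, gains (`k_att`, `k_rep`), and the `influence` radius are hypothetical and are not taken from the paper.

```python
import math


def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic attractive potential (assumed form): grows with squared distance to the goal."""
    return 0.5 * k_att * math.dist(pos, goal) ** 2


def repulsive_potential(pos, obstacle, k_rep=1.0, influence=0.5):
    """Repulsive potential (assumed form): zero outside the influence radius,
    growing rapidly as the end effector approaches the obstacle."""
    d = math.dist(pos, obstacle)
    if d >= influence:
        return 0.0
    d = max(d, 1e-6)  # guard against division by zero at contact
    return 0.5 * k_rep * (1.0 / d - 1.0 / influence) ** 2


def total_potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, influence=0.5):
    """Sum of the goal attraction and the repulsion from every obstacle."""
    repulsion = sum(
        repulsive_potential(pos, obs, k_rep, influence) for obs in obstacles
    )
    return attractive_potential(pos, goal, k_att) + repulsion


def shaped_reward(prev_pos, pos, goal, obstacles, **gains):
    """Dense per-step reward: positive when the step lowers the total potential,
    i.e. moves toward the goal and/or away from obstacles."""
    return total_potential(prev_pos, goal, obstacles, **gains) - total_potential(
        pos, goal, obstacles, **gains
    )


if __name__ == "__main__":
    goal = (0.0, 0.0, 0.0)
    obstacles = [(0.5, 0.4, 0.0)]
    # A step toward the goal that stays outside the obstacle's influence radius.
    print(shaped_reward((1.0, 0.0, 0.0), (0.9, 0.0, 0.0), goal, obstacles))
```

Rewarding the change in potential, rather than the potential itself, gives the agent feedback at every step while keeping the per-step signal bounded far from obstacles; whether the paper uses a difference form or the raw potential is not stated in the abstract.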


Pages: 17
