Representation Reinforcement Learning-Based Dense Control for Point Following With State Sparse Sensing of 3-D Snake Robots

Cited by: 1
Authors
Liu, Lixing [1 ,2 ]
Liu, Jiashun [3 ]
Guo, Xian [1 ,2 ]
Huang, Wei [1 ,2 ]
Fang, Yongchun [1 ,2 ]
Hao, Jianye [3 ]
Affiliations
[1] Nankai Univ, Coll Artificial Intelligence, Inst Robot & Automat Informat Syst, Tianjin 300350, Peoples R China
[2] Nankai Univ, Tianjin Key Lab Intelligent Robot, Tianjin 300350, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Robot sensing systems; Snake robots; Robots; Sensors; Motion control; Training; Crawlers; Standards; Process control; Optimization; 3-D snake robots; biomimetic robots; dense motion control; representation reinforcement learning (RRL); sparse state sensing; GAIT DESIGN;
DOI
10.1109/TMECH.2024.3465018
Chinese Library Classification (CLC) Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
During robot motion, the environmental state often fails to update in real time because of interference from factors such as obstacle occlusion and communication disruption, which commonly causes interruptions or even failures in motion control. To achieve dense motion control under sparse state sensing, a key challenge is to predict multiple future actions from sparse states, which is hindered by the large and complex action search space. Limited research has been dedicated to this challenge. Therefore, this article proposes a representation reinforcement learning (RRL) based solution, called Sparse-State to Dense-Actions Latent Control, that realizes dense motion control of 3-D snake robots under sparse environmental state sensing and guarantees satisfactory point-following performance. In particular, by introducing a latent representation of multiple actions, the control policy optimizes latent actions to predict dense motion gaits, which significantly improves training efficiency and motion performance. Meanwhile, to learn a compact latent variable model, three mechanisms are proposed to ensure efficient training, semantic smoothness, and energy efficiency, facilitating exploration by the RL algorithm. To the best of our knowledge, this article provides the first solution that enables a 3-D snake robot to accomplish point-following tasks under sparse state sensing. Simulation and practical experiments confirm the effectiveness, robustness, and generalizability of the proposed algorithm, with all following errors less than 0.02 m.
Pages: 11
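
To make the latent-action idea described in the abstract concrete, the following minimal PyTorch sketch (not the authors' implementation; all dimensions, module names, and the horizon H are illustrative assumptions) shows how a policy could map one sparsely sensed state to a compact latent action, which a decoder then expands into a dense sequence of joint commands executed until the next state update.

# Illustrative sketch (not the paper's code): a policy maps one sparse state
# observation to a compact latent action z, and a decoder expands z into a
# dense sequence of H joint commands executed between state updates.
# All dimensions, module names, and the horizon H are assumptions.
import torch
import torch.nn as nn

STATE_DIM = 20      # assumed dimension of the sparsely sensed state
LATENT_DIM = 8      # assumed latent action dimension
NUM_JOINTS = 16     # assumed number of snake-robot joints
H = 10              # assumed number of dense control steps per sensed state


class LatentPolicy(nn.Module):
    """Maps one sparse state observation to a compact latent action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, LATENT_DIM), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


class ActionDecoder(nn.Module):
    """Expands a latent action into H consecutive joint-command vectors."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, H * NUM_JOINTS), nn.Tanh(),  # normalized commands in [-1, 1]
        )

    def forward(self, z):
        return self.net(z).view(-1, H, NUM_JOINTS)


if __name__ == "__main__":
    policy, decoder = LatentPolicy(), ActionDecoder()
    sparse_state = torch.randn(1, STATE_DIM)   # one delayed/sparse observation
    z = policy(sparse_state)                   # latent action chosen by the policy
    dense_actions = decoder(z)                 # H joint commands to execute
    print(dense_actions.shape)                 # torch.Size([1, 10, 16])

The point of the compact latent space in this sketch is that the RL policy searches over LATENT_DIM values instead of H x NUM_JOINTS raw joint commands, which illustrates how the latent representation shrinks the action search space mentioned in the abstract.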