Reinforcement Learning Mission Supervisor Design for Behavior-based Differential Drive Robots

Cited by: 0

Authors:

Zhang, Zhenyi [1,2]

Huang, Jie [1,2]

Affiliations:

[1] School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China

[2] 5G+ Industrial Internet Institute of Fuzhou University, Fuzhou 350108, China

Source:

Jiqiren/Robot | 2024 / Vol. 46 / No. 04

Keywords:

Reinforcement learning

DOI:

10.13973/j.cnki.robot.230148

Abstract:

A multi-agent reinforcement learning mission supervisor (MARLMS) is designed for differential drive robots using trial-and-error learning. The proposed MARLMS addresses the challenge inherent in behavior-based multi-agent systems, wherein the design of switching rules to determine behavior priorities relies heavily on human intelligence. Building upon the null-space-based behavioral control (NSBC) framework, a differential model is introduced to replace the particle model. Consequently, a paradigm of NSBC with nonholonomic constraints is presented for the first time, enhancing the system robustness to the minimum extremum state. Subsequently, a joint policy is developed to dynamically and intelligently determine behavior priorities by modeling the behavior priority switching problem as a cooperative Markov game. The proposed MARLMS not only eliminates the need for manual design of switching rules but also reduces the computational and storage burdens during online operations. Simulation results demonstrate the superior behavior priority switching performance of the proposed MARLMS. Furthermore, successful implementation on AgileX Limo robots validates the practicality of the proposed MARLMS. © 2024 Chinese Academy of Sciences. All rights reserved.

Pages: 397-416
