Reinforcement Learning Mission Supervisor Design for Behavior-based Differential Drive Robots

Authors
Zhang, Zhenyi [1,2]
Huang, Jie [1,2]
Affiliations
[1] School of Electrical Engineering and Automation, Fuzhou University, Fuzhou, 350108, China
[2] 5G+ Industrial Internet Institute of Fuzhou University, Fuzhou, 350108, China
Source
Jiqiren/Robot | 2024 / Vol. 46 / No. 04
Keywords
Reinforcement learning
DOI
10.13973/j.cnki.robot.230148
Abstract
A multi-agent reinforcement learning mission supervisor (MARLMS) is designed for differential drive robots using trial-and-error learning. The proposed MARLMS addresses the challenge inherent in behavior-based multi-agent systems, wherein the design of switching rules to determine behavior priorities relies heavily on human intelligence. Building upon the null-space-based behavioral control (NSBC) framework, a differential model is introduced to replace the particle model. Consequently, a paradigm of NSBC with nonholonomic constraints is presented for the first time, enhancing the system robustness to the minimum extremum state. Subsequently, a joint policy is developed to dynamically and intelligently determine behavior priorities by modeling the behavior priority switching problem as a cooperative Markov game. The proposed MARLMS not only eliminates the need for manual design of switching rules but also reduces the computational and storage burdens during online operations. Simulation results demonstrate the superior behavior priority switching performance of the proposed MARLMS. Furthermore, successful implementation on AgileX Limo robots validates the practicality of the proposed MARLMS. © 2024 Chinese Academy of Sciences. All rights reserved.
Pages: 397 - 416