A multi-task deep reinforcement learning framework based on curriculum learning and policy distillation for quadruped robot motor skill training

Cited by: 0
Authors
Chen, Liang [1, 2]
Shen, Bo [1, 2]
Hong, Jiale [1, 2]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai, Peoples R China
[2] Minist Educ, Engn Res Ctr Digitalized Text & Fash Technol, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Reinforcement learning; multi-task; policy distillation; curriculum; quadrupedal robots; optimization
DOI
10.1080/21642583.2025.2498914
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep reinforcement learning (RL) approaches are increasingly prominent in the field of robotics due to their adaptive decision-making capability. However, developing a single RL agent capable of performing multiple continuous control tasks for quadruped robots remains challenging. In this paper, a multi-task deep RL framework based on curriculum learning and policy distillation is proposed, which aims to enhance the quadruped robot's motor performance across multiple continuous tasks by leveraging knowledge from expert skill teachers. The main novelties of the framework lie in the self-optimizing terrain curriculum learning strategy and the improved distillation loss function. The proposed self-optimizing terrain curriculum learning strategy for quadrupedal robots is designed to utilize Bayesian optimization to predict potential training terrains, thus effectively identifying the most suitable training curriculum. Additionally, the improved distillation loss function for RL weight optimization is proposed to enhance the transferability of the trained policy across diverse tasks. To validate the effectiveness of the proposed multi-task deep RL framework, the performance of the policy generated by the framework across diverse terrains is assessed. The experimental results demonstrate that the proposed multi-task deep RL framework could generate a unified policy that achieves excellent performance across multiple continuous control tasks for quadruped robots.
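As a rough illustration of the policy distillation idea mentioned in the abstract, the sketch below computes a generic distillation loss: the mean KL divergence from a teacher's action distribution to a student's, assuming both policies output diagonal Gaussian distributions over continuous actions. This is the textbook form only, not the improved loss proposed in the paper, whose details the abstract does not give. The function names `gaussian_kl` and `distillation_loss` and the batch layout are hypothetical.

```python
import math


def gaussian_kl(mu_t, std_t, mu_s, std_s):
    """KL(teacher || student) for diagonal Gaussian action distributions.

    Each argument is a list with one entry per action dimension.
    Assumption: both policies are diagonal Gaussians (illustrative only).
    """
    kl = 0.0
    for mt, st, ms, ss in zip(mu_t, std_t, mu_s, std_s):
        # Closed-form KL between two 1-D Gaussians, summed over dimensions.
        kl += math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2.0 * ss ** 2) - 0.5
    return kl


def distillation_loss(batch):
    """Mean KL over a batch of (mu_t, std_t, mu_s, std_s) tuples."""
    return sum(gaussian_kl(*sample) for sample in batch) / len(batch)


if __name__ == "__main__":
    # Hypothetical two-sample batch with one action dimension each.
    batch = [
        ([0.0], [1.0], [0.0], [1.0]),  # student matches teacher exactly
        ([1.0], [1.0], [0.0], [1.0]),  # student mean is off by 1.0
    ]
    print(distillation_loss(batch))
```

In a multi-task setting, a loss of this kind would typically be evaluated per task against that task's expert teacher and then combined across tasks; how the paper weights or modifies these terms is not stated in the abstract.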
Pages: 13