Multi-Task Reinforcement Learning Based on Parallel Recombination Networks

Times Cited: 0
Authors
Liu, Manlu
Zhang, Qingbo [1 ]
Qian, Weimin
Affiliations
[1] Southwest Univ Sci & Technol, Sch Informat Engn, Mianyang 621010, Peoples R China
Keywords
Multitasking; Reinforcement learning; Training; Robots; Manipulators; Optimization; Metaverse; Multi-task reinforcement learning; meta-world; parallel recombination network; robotic manipulation task
DOI
10.1109/ACCESS.2024.3449072
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Multi-task reinforcement learning is a key current trend in the field of reinforcement learning. It accomplishes multiple tasks with a single network and is superior to single-task learning in its ability to integrate information from different tasks. However, how to effectively share parameters across tasks within the network remains an open question. To address this problem, this paper proposes a "soft parallel recombination network," which shares task information across network layers without being limited to adjacent layers, thereby enhancing the network's information-sharing capability. The multi-task setting in this paper comprises various manipulator control tasks executed in the Meta-world environment, such as pick-and-place, push, and stacking. For optimal performance, a weight network is introduced that automatically determines the optimal path for each task by outputting the probability of each module being selected. The proposed method thus learns the relationships between tasks from the parallel recombination network and determines the optimal path for each task through the weight network. Furthermore, the weighting relationship between the current training samples and the current policy is identified, which, combined with the parallel recombination network, improves training efficiency. The proposed "soft parallel recombination network" method is combined with the SAC algorithm (PRSAC) and validated on the Meta-world multi-task training platform; the experimental results demonstrate that the proposed method significantly outperforms existing baseline algorithms in both sample efficiency and performance.
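The routing idea in the abstract (a weight network emits a selection probability for each parallel module, and layer outputs are recombined across non-adjacent layers) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the dimensions, the linear modules, the tanh nonlinearity, and the simple summation used to mix non-adjacent layer outputs are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: 2 layers, each with 3 parallel modules of width 4.
n_layers, n_modules, dim = 2, 3, 4

# Each module is a small linear map (dim -> dim); weights are random here.
modules = rng.normal(size=(n_layers, n_modules, dim, dim))

# A stand-in "weight network": maps a task embedding to routing logits,
# one logit per module at each layer.
task_embedding = rng.normal(size=dim)
routing_net = rng.normal(size=(n_layers, n_modules, dim))

def forward(x):
    layer_outputs = []
    h = x
    for layer in range(n_layers):
        logits = routing_net[layer] @ task_embedding  # (n_modules,)
        probs = softmax(logits)  # module selection probabilities
        # Soft routing: every module contributes, weighted by its probability.
        h = sum(p * np.tanh(W @ h) for p, W in zip(probs, modules[layer]))
        layer_outputs.append(h)
    # "Parallel recombination" sketch: mix the outputs of all layers so that
    # information flows across non-adjacent layers, not just between neighbors.
    return sum(layer_outputs)

y = forward(rng.normal(size=dim))
```

In a trained system the routing logits and module weights would be learned jointly (e.g. inside a SAC policy/critic), with the softmax giving the "probability of each module being selected" described in the abstract.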
Pages: 80113-80122
Number of pages: 10