Multi-Task Reinforcement Learning Based on Parallel Recombination Networks

Times Cited: 0
Authors
Liu, Manlu
Zhang, Qingbo [1 ]
Qian, Weimin
Affiliations
[1] Southwest Univ Sci & Technol, Sch Informat Engn, Mianyang 621010, Peoples R China
Keywords
Multitasking; Reinforcement learning; Training; Robots; Manipulators; Optimization; Metaverse; Multi-task reinforcement learning; meta-world; parallel recombination network; robotic manipulation task
DOI
10.1109/ACCESS.2024.3449072
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Multi-task reinforcement learning is a key current trend in the field of reinforcement learning. It accomplishes multiple tasks with a single network and is superior to single-task learning in its ability to integrate information from different tasks. However, how to effectively share parameters across tasks within the network remains an open question. To address this problem, this paper proposes a "soft parallel recombination network," which shares task information across network layers without being limited to adjacent layers, thereby enhancing the network's information-sharing capability. The multi-task setting in this paper comprises various manipulator control tasks executed in the Meta-world environment, such as pick-and-place, push, and stacking. For optimal performance, a weight network is introduced that automatically determines the optimal path for each task by outputting the probability of each module being selected. The proposed method thus learns the relationships between tasks from the parallel recombination network and determines the optimal path for each task through the weight network. Furthermore, the weighting relationship between the current training samples and the current policy is identified, which, combined with the parallel recombination network, improves training efficiency. The proposed "soft parallel recombination network" method is combined with the SAC algorithm (PRSAC) and validated on the Meta-world multi-task training platform; the experimental results demonstrate that the proposed method significantly outperforms existing baseline algorithms in both sample efficiency and performance.
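The routing idea in the abstract (a weight network emits a selection probability for each parallel module, and layer outputs are recombined across non-adjacent layers) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the dimensions, the linear modules, the tanh nonlinearity, and the simple summation used to mix non-adjacent layer outputs are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: 2 layers, each with 3 parallel modules of width 4.
n_layers, n_modules, dim = 2, 3, 4

# Each module is a small linear map (dim -> dim); weights are random here.
modules = rng.normal(size=(n_layers, n_modules, dim, dim))

# A stand-in "weight network": maps a task embedding to routing logits,
# one logit per module at each layer.
task_embedding = rng.normal(size=dim)
routing_net = rng.normal(size=(n_layers, n_modules, dim))

def forward(x):
    layer_outputs = []
    h = x
    for layer in range(n_layers):
        logits = routing_net[layer] @ task_embedding  # (n_modules,)
        probs = softmax(logits)  # module selection probabilities
        # Soft routing: every module contributes, weighted by its probability.
        h = sum(p * np.tanh(W @ h) for p, W in zip(probs, modules[layer]))
        layer_outputs.append(h)
    # "Parallel recombination" sketch: mix the outputs of all layers so that
    # information flows across non-adjacent layers, not just between neighbors.
    return sum(layer_outputs)

y = forward(rng.normal(size=dim))
```

In a trained system the routing logits and module weights would be learned jointly (e.g. inside a SAC policy/critic), with the softmax giving the "probability of each module being selected" described in the abstract.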
Pages: 80113-80122
Number of pages: 10