T3S: Improving Multi-Task Reinforcement Learning with Task-Specific Feature Selector and Scheduler

Cited: 0
Authors
Yu, Yuanqiang [1 ]
Yang, Tianpei [2 ,3 ]
Lv, Yongliang [1 ]
Zheng, Yan [1 ]
Hao, Jianye [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[3] Alberta Machine Intelligence Inst, Edmonton, AB, Canada
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2023
Keywords
reinforcement learning; multi-task learning; knowledge sharing; task scheduler
DOI
10.1109/IJCNN54540.2023.10191536
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-task reinforcement learning (MTRL) trains an agent on multiple tasks simultaneously; previous works usually train a single model to solve different tasks by sharing parameters across them. However, these methods suffer from inter-task interference, since the question of which parameters should be shared across tasks is left unaddressed, dramatically reducing learning efficiency. To solve these problems, we propose a novel MTRL framework called Task-Specific feature Selector and Scheduler (T3S), which consists of two components: a feature selector and a task scheduler. Specifically, the feature selectors employ hypernetworks to construct task-specific soft masks, which are applied to a globally shared representation to produce task-specific features. The task scheduler selects tasks for learning according to two metrics: the selection probability is inversely proportional to task progress (e.g., success rate) and to task learning speed. Experimental results show that T3S consistently outperforms state-of-the-art MTRL algorithms on various robotics manipulation tasks.
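The two mechanisms the abstract describes can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the paper's implementation: the names (FeatureSelector, schedule_probs, num_tasks, embed_dim, feature_dim) are hypothetical, the hypernetwork is reduced to a single linear layer over a task embedding, and the two scheduling metrics are combined additively, which is only one plausible reading of "inversely proportional to task progress and task learning speed".

import torch
import torch.nn as nn

class FeatureSelector(nn.Module):
    # Hypernetwork sketch: a learned task embedding is mapped to a soft mask
    # in (0, 1) that gates each dimension of the globally shared representation.
    def __init__(self, num_tasks: int, embed_dim: int, feature_dim: int):
        super().__init__()
        self.task_embed = nn.Embedding(num_tasks, embed_dim)
        self.hyper = nn.Linear(embed_dim, feature_dim)  # stand-in for the hypernetwork

    def forward(self, shared_features: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        mask = torch.sigmoid(self.hyper(self.task_embed(task_id)))
        return shared_features * mask  # task-specific features

def schedule_probs(progress: torch.Tensor, speed: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Selection probability inversely proportional to task progress
    # (e.g., success rate) and to task learning speed; summing the two
    # inverse terms before normalizing is an assumption.
    scores = 1.0 / (progress + eps) + 1.0 / (speed + eps)
    return scores / scores.sum()

# Usage sketch: lagging tasks (low progress, slow learning) are sampled more often.
progress = torch.tensor([0.9, 0.2, 0.5])   # per-task success rates (hypothetical values)
speed = torch.tensor([0.30, 0.05, 0.20])   # per-task learning speeds (hypothetical values)
task = torch.multinomial(schedule_probs(progress, speed), num_samples=1)

One reason masking a shared trunk is attractive: the parameter count stays independent of the number of tasks, while each task can still gate out features that would otherwise interfere with it.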
Pages: 8