T3S: Improving Multi-Task Reinforcement Learning with Task-Specific Feature Selector and Scheduler

Cited: 0
Authors
Yu, Yuanqiang [1 ]
Yang, Tianpei [2 ,3 ]
Lv, Yongliang [1 ]
Zheng, Yan [1 ]
Hao, Jianye [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[3] Alberta Machine Intelligence Inst, Edmonton, AB, Canada
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2023
Keywords
reinforcement learning; multi-task learning; knowledge sharing; task scheduler
DOI
10.1109/IJCNN54540.2023.10191536
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-task reinforcement learning (MTRL) trains an agent on multiple tasks simultaneously; previous works usually train a single model to solve different tasks by sharing parameters across them. However, these methods suffer from inter-task interference, since the question of which parameters should be shared across tasks is left unaddressed, dramatically reducing learning efficiency. To solve these problems, we propose a novel MTRL framework called Task-Specific feature Selector and Scheduler (T3S), which consists of two components: a feature selector and a task scheduler. Specifically, the feature selectors employ hypernetworks to construct task-specific soft masks, which are applied to a globally shared representation to produce task-specific features. The task scheduler selects tasks for learning according to two metrics: the selection probability is inversely proportional to task progress (e.g., success rate) and to task learning speed. Experimental results show that T3S consistently outperforms state-of-the-art MTRL algorithms on various robotics manipulation tasks.
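The two mechanisms the abstract describes can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the paper's implementation: the names (FeatureSelector, schedule_probs, num_tasks, embed_dim, feature_dim) are hypothetical, the hypernetwork is reduced to a single linear layer over a task embedding, and the two scheduling metrics are combined additively, which is only one plausible reading of "inversely proportional to task progress and task learning speed".

import torch
import torch.nn as nn

class FeatureSelector(nn.Module):
    # Hypernetwork sketch: a learned task embedding is mapped to a soft mask
    # in (0, 1) that gates each dimension of the globally shared representation.
    def __init__(self, num_tasks: int, embed_dim: int, feature_dim: int):
        super().__init__()
        self.task_embed = nn.Embedding(num_tasks, embed_dim)
        self.hyper = nn.Linear(embed_dim, feature_dim)  # stand-in for the hypernetwork

    def forward(self, shared_features: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        mask = torch.sigmoid(self.hyper(self.task_embed(task_id)))
        return shared_features * mask  # task-specific features

def schedule_probs(progress: torch.Tensor, speed: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Selection probability inversely proportional to task progress
    # (e.g., success rate) and to task learning speed; summing the two
    # inverse terms before normalizing is an assumption.
    scores = 1.0 / (progress + eps) + 1.0 / (speed + eps)
    return scores / scores.sum()

# Usage sketch: lagging tasks (low progress, slow learning) are sampled more often.
progress = torch.tensor([0.9, 0.2, 0.5])   # per-task success rates (hypothetical values)
speed = torch.tensor([0.30, 0.05, 0.20])   # per-task learning speeds (hypothetical values)
task = torch.multinomial(schedule_probs(progress, speed), num_samples=1)

One reason masking a shared trunk is attractive: the parameter count stays independent of the number of tasks, while each task can still gate out features that would otherwise interfere with it.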
Pages: 8