Automatic Curriculum Graph Generation for Reinforcement Learning Agents

Cited by: 0
Authors
Svetlik, Maxwell [1]
Leonetti, Matteo [2]
Sinapov, Jivko [1]
Shah, Rishi [1]
Walker, Nick [1]
Stone, Peter [1]
Affiliations
[1] University of Texas at Austin, Austin, TX 78712, USA
[2] University of Leeds, Leeds, West Yorkshire, England
Source
Thirty-First AAAI Conference on Artificial Intelligence | 2017
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In recent years, research has shown that transfer learning methods can be leveraged to construct curricula that sequence a series of simpler tasks such that performance on a final target task is improved. A major limitation of existing approaches is that such curricula are handcrafted by humans, who are typically domain experts. To address this limitation, we introduce a method to generate a curriculum based on task descriptors and a novel metric of transfer potential. Our method automatically generates a curriculum as a directed acyclic graph (as opposed to a linear sequence, as in existing work). Experiments in both discrete and continuous domains show that our method produces curricula that improve the agent's learning performance compared to the baseline condition of learning the target task from scratch.
Pages: 2590-2596
Page count: 7