Learning Temporal Strategic Relationships using Generative Adversarial Imitation Learning

Cited by: 0
Authors
Fernando, Tharindu [1]
Denman, Simon [1]
Sridharan, Sridha [1]
Fookes, Clinton [1]
Affiliations
[1] Queensland Univ Technol, Image & Video Res Lab, Brisbane, Qld, Australia
Source
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS '18) | 2018
Funding
Australian Research Council
Keywords
Generative Adversarial Imitation Learning; Autonomous Driving; Long-term Planning with Autonomous Agents
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents a novel framework for automatically learning complex strategies in human decision making. The task we are interested in is better facilitating long-term planning for complex, multi-step events. We observe temporal relationships at the subtask level of expert demonstrations and determine the different strategies employed to successfully complete a task. To capture the relationship between the subtasks and the overall goal, we utilise two external memory modules: one for capturing dependencies within a single expert demonstration, such as the sequential relationship among different subtasks, and a global memory module for modelling task-level characteristics, such as the best practices employed by different humans based on their domain expertise. Furthermore, we demonstrate how the hidden state representation of the memory can be used as a reward signal to smooth the state transitions, eradicating subtle changes. We evaluate the effectiveness of the proposed model on an autonomous highway driving application, where we demonstrate its capability to learn different expert policies and outperform state-of-the-art methods. The scope for industrial application extends to any robotics and automation application that requires learning from complex demonstrations containing a series of subtasks.
Pages: 113-121
Number of pages: 9