Transferring knowledge as heuristics in reinforcement learning: A case-based approach

Cited by: 47
Authors
Bianchi, Reinaldo A. C. [1]
Celiberto, Luiz A., Jr. [2]
Santos, Paulo E. [1]
Matsuura, Jackson P. [3]
Lopez de Mantaras, Ramon [4]
Affiliations
[1] Ctr Univ FEI, BR-09850901 Sao Paulo, Brazil
[2] Univ Fed ABC UFABC, Ctr Engn Modelagem & Ciencias Sociais Aplicadas C, BR-09210580 Sao Paulo, Brazil
[3] Technol Inst Aeronaut ITA, BR-12228900 Sao Paulo, Brazil
[4] CSIC, IIIA Artificial Intelligence Res Inst, Spanish Natl Res Council, Bellaterra 08193, Catalonia, Spain
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Case-based reasoning; Reinforcement learning; Transfer learning; Similarity; Selection
DOI
10.1016/j.artint.2015.05.008
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes and analyses a transfer learning meta-algorithm that supports distinct methods for accelerating a Reinforcement Learning procedure in one domain (the target) using heuristics obtained from another, simpler domain (the source). The meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task in the source domain, storing the knowledge thus obtained in a case base; second, it performs an unsupervised mapping of the source-domain actions to the target-domain actions; and third, it uses the case base obtained in the first stage as heuristics to speed up learning in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a case base learned from a 2D simulation) and stability learning for a humanoid robot in the RoboCup 3D Soccer Simulator (using knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically accelerated reinforcement learning and transfer learning algorithms. © 2015 Elsevier B.V. All rights reserved.
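The third stage described in the abstract (a case base used as heuristics during target-domain learning) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the nearest-case retrieval by Euclidean distance, the fixed heuristic bonus, the weight xi, the hand-written action_map (standing in for the unsupervised action mapping), and all function and action names are assumptions made only for illustration. The sketch also assumes the target state has already been projected into the source state space.

```python
import random


def retrieve_case(case_base, state):
    """Return the stored case whose state is nearest to `state`, or None.

    Assumption: similarity is plain squared Euclidean distance.
    """
    if not case_base:
        return None
    return min(
        case_base,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(c["state"], state)),
    )


def heuristic_from_case(case, action_map, target_actions, bonus=1.0):
    """Build H(s, .) for the target domain from one retrieved source case.

    `action_map` is a hypothetical source-to-target action mapping; the
    target action the case suggests gets `bonus`, all others get 0.
    """
    h = {a: 0.0 for a in target_actions}
    if case is not None:
        mapped = action_map.get(case["action"])
        if mapped in h:
            h[mapped] = bonus
    return h


def select_action(q_values, heuristic, target_actions, epsilon=0.0, xi=1.0, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.choice(target_actions)
    return max(target_actions, key=lambda a: q_values[a] + xi * heuristic[a])


# Hypothetical data: two source cases and a four-action target domain.
case_base = [
    {"state": (0.0, 0.0), "action": "left"},
    {"state": (1.0, 1.0), "action": "right"},
]
action_map = {"left": "west", "right": "east"}
target_actions = ["west", "east", "north", "south"]

q_values = {a: 0.0 for a in target_actions}  # untrained target Q-values
case = retrieve_case(case_base, (0.9, 0.8))
heuristic = heuristic_from_case(case, action_map, target_actions)
chosen = select_action(q_values, heuristic, target_actions)
```

With untrained Q-values, the heuristic alone decides the greedy action; once the learned Q-values for another action exceed the heuristic bonus, the heuristic no longer dominates the choice, which is why a biasing term of this kind can speed up early learning without fixing the final policy.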
Pages: 102-121
Number of pages: 20