Transferring knowledge as heuristics in reinforcement learning: A case-based approach

Cited by: 48

Authors:

Bianchi, Reinaldo A. C. [1]

Celiberto, Luiz A., Jr. [2]

Santos, Paulo E. [1]

Matsuura, Jackson P. [3]

Lopez de Mantaras, Ramon [4]

Affiliations:

[1] Ctr Univ FEI, BR-09850901 Sao Paulo, Brazil

[2] Univ Fed ABC UFABC, Ctr Engn Modelagem & Ciencias Sociais Aplicadas C, BR-09210580 Sao Paulo, Brazil

[3] Technol Inst Aeronaut ITA, BR-12228900 Sao Paulo, Brazil

[4] CSIC, IIIA Artificial Intelligence Res Inst, Spanish Natl Res Council, Bellaterra 08193, Catalonia, Spain

Source:

Artificial Intelligence | 2015 / Vol. 226

Funding:

São Paulo Research Foundation (FAPESP), Brazil

Keywords:

Case-based reasoning; Reinforcement learning; Transfer learning; Similarity; Selection

DOI:

10.1016/j.artint.2015.05.008

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods that use heuristics, obtained from another (simpler) domain (the source), to accelerate a Reinforcement Learning procedure in one domain (the target). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a case base learned from a 2D simulation) and stability learning for a humanoid robot in the RoboCup 3D Soccer Simulator (using knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms. © 2015 Elsevier B.V. All rights reserved.
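The third stage described in the abstract, using transferred cases as heuristics during action selection, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a tabular setting, a hand-written action map standing in for the unsupervised mapping, states shared between source and target, and a common heuristically accelerated form of greedy selection that maximises Q(s, a) + xi * H(s, a). The names `build_heuristic`, `heuristic_action`, `xi`, and `bonus` are hypothetical.

```python
import random
from collections import defaultdict


def build_heuristic(case_base, action_map, bonus=1.0):
    """Turn source-domain cases (state, action) into a target-domain heuristic table.

    Assumption: states are directly comparable across domains; the abstract
    does not say how cases are retrieved, so this sketch simply looks them up.
    """
    h = {}
    for state, source_action in case_base:
        target_action = action_map[source_action]  # stage 2: action mapping
        h[(state, target_action)] = bonus
    return h


def heuristic_action(q, h, state, actions, xi=1.0, epsilon=0.1, rng=random):
    """Epsilon-greedy selection whose greedy step maximises Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    return max(actions, key=lambda a: q[(state, a)] + xi * h.get((state, a), 0.0))


# Hypothetical toy data: two stored cases and a hand-written action map.
case_base = [(0, "left"), (1, "right")]
action_map = {"left": "west", "right": "east"}
target_actions = ["west", "east", "north", "south"]

h = build_heuristic(case_base, action_map)
q = defaultdict(float)  # untrained target-domain Q-table, all zeros

# With epsilon=0.0 the choice is purely greedy, so the heuristic decides.
chosen = heuristic_action(q, h, 0, target_actions, epsilon=0.0)
```

With an untrained Q-table, the heuristic alone breaks the tie between actions, which is how transferred cases can bias early exploration in the target domain before the learned values take over.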

Pages: 102-121

Page count: 20
