Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

Cited by: 3
Authors
Muller-Brockhausen, Matthias [1 ]
Preuss, Mike [1 ]
Plaat, Aske [1 ]
Affiliations
[1] Leiden Univ, Leiden Inst Adv Comp Sci, Leiden, Netherlands
Source
2021 IEEE CONFERENCE ON GAMES (COG) | 2021
Keywords
Transfer; Reinforcement Learning; Benchmarks; Procedural Content Generation; FRAMEWORK; AI
DOI
10.1109/COG52621.2021.9619000
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The idea of transfer in reinforcement learning (TRL) is intriguing: transferring knowledge from one problem to another without learning everything from scratch. This promises quicker learning and the ability to learn more complex methods. To gain insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning, starting in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when the domains are strongly similar (e.g., MountainCar to CartPole), and most TRL publications focus on different tasks within the same domain that differ only slightly. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field still lacks unified benchmarks. We consider this a disappointing situation. For the future, we note that: (1) a clear measure of task similarity is needed; (2) generalization needs to improve, and promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs; (3) the lack of benchmarking tools must be remedied to enable meaningful comparison and to measure progress. Alchemy and Meta-World are already emerging as interesting benchmark suites. We note that another development, the rise of procedural content generation (PCG), can improve both benchmarking and generalization in TRL.
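As an illustrative aside, not part of the record itself: the abstract's claim that PCG improves benchmarking and generalization can be made concrete with a PCG-based benchmark such as OpenAI's Procgen, where levels are generated from seeds, so training and evaluation can use disjoint level sets. The environment name, seed ranges, and random policy below are assumptions chosen for this minimal sketch, not details taken from the paper.

```python
# Minimal sketch (assumptions: Procgen installed via `pip install gym procgen`,
# classic gym step API). PCG lets us train on one set of generated levels and
# evaluate on levels the agent has never seen, which is what makes PCG suites
# useful generalization benchmarks for (T)RL.
import gym

# Train on a fixed, finite set of 200 procedurally generated levels...
train_env = gym.make("procgen:procgen-coinrun-v0", start_level=0, num_levels=200)
# ...and evaluate on a disjoint seed range of unseen levels.
test_env = gym.make("procgen:procgen-coinrun-v0", start_level=10_000, num_levels=1000)

def mean_return(env, episodes=5):
    """Average episode return under a random policy (a stand-in for a
    trained or transferred agent)."""
    returns = []
    for _ in range(episodes):
        env.reset()
        done, total = False, 0.0
        while not done:
            _, reward, done, _ = env.step(env.action_space.sample())
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)

print("return on training levels:", mean_return(train_env))
# A large gap between the two numbers indicates poor generalization.
print("return on unseen levels:  ", mean_return(test_env))
```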
Pages: 924-931
Page count: 8