LevDoom: A Benchmark for Generalization on Level Difficulty in Reinforcement Learning

Cited by: 3
Authors
Tomilin, Tristan [1 ]
Dai, Tianhong [2 ]
Fang, Meng [1 ]
Pechenizkiy, Mykola [1 ]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
[2] Imperial Coll London, London, England
Source
2022 IEEE Conference on Games (CoG), 2022
Keywords
reinforcement learning; generalization; ViZDoom; environment
DOI
10.1109/CoG51982.2022.9893707
CLC Classification
TP39 [Computer Applications]
Discipline Code
081203; 0835
Abstract
Despite the recent success of deep reinforcement learning (RL), the generalization ability of RL agents remains an open problem for real-world applicability. RL agents trained on pixels may be completely derailed from achieving their objectives in unseen situations with different degrees of visual change. However, numerous existing RL suites do not address this as a primary objective, or lack a consistent level design of increasing complexity. In this paper, we introduce the LevDoom benchmark, a suite of semi-realistic 3D simulation environments with coherent difficulty levels built on the renowned video game Doom, designed to benchmark generalization in vision-based RL. We demonstrate how our benchmark reveals weaknesses of popular deep RL algorithms, which fail to cope with modified environments. We further show how our difficulty level design presents increasing complexity to these algorithms.
Pages: 72-79
Page count: 8