State-Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning

Cited by: 0
Authors
Wang, Ziqi [1 ,2 ]
Shu, Tianye [1 ,2 ]
Liu, Jialin [1 ,2 ]
Affiliations
[1] Southern Univ Sci & Technol SUSTech, Dept Comp Sci & Engn, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen 518055, Peoples R China
[2] Southern Univ Sci & Technol SUSTech, Res Inst Trustworthy Autonomous Syst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Games; Training; Reinforcement learning; Generators; Deep learning; Visualization; Hamming distances; Content diversity; online level generation (OLG); platformer games; procedural content generation (PCG); PCG via reinforcement learning (RL);
DOI
10.1109/TG.2023.3262297
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this letter, we revisit endless online level generation with the recently proposed experience-driven procedural content generation via reinforcement learning (EDRL) framework. Inspired by the observation that EDRL tends to generate recurrent patterns, we formulate the notion of state-space closure, which ensures that any state that can possibly appear in an infinite-horizon online generation process can also be found within a finite horizon. Through theoretical analysis, we find that although state-space closure raises a concern about diversity, it allows EDRL trained with a finite horizon to generalize to the infinite-horizon scenario without deterioration of content quality. Moreover, we verify the quality and diversity of content generated by EDRL via empirical studies on the widely used Super Mario Bros. benchmark. Experimental results reveal that the diversity of levels generated by EDRL is limited by state-space closure, whereas their quality does not deteriorate over a horizon longer than the one specified during training. Based on these outcomes and analyses, future work on endless online level generation via reinforcement learning should address the issue of diversity while preserving state-space closure and quality.
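The closure notion described in the abstract can be illustrated with a minimal sketch. Assuming the generator's outputs are modeled as a finite set of segment states with stochastic successors (the `transitions` map and the state names below are hypothetical, not from the paper), the set of states reachable from a start state stops growing after a finite number of steps; that finite horizon is where the state space "closes":

```python
# Hypothetical sketch: state-space closure for a finite-state level generator.
# Each state maps to the set of segments the generator may emit next; closure
# means the reachable set reaches a fixed point within a finite horizon.
def reachable_closure(transitions, start):
    """Return (closure_set, horizon): the set of all reachable states and
    the first step index at which no new states appear."""
    frontier = {start}
    seen = {start}
    horizon = 0
    while frontier:
        successors = set()
        for s in frontier:
            successors |= set(transitions.get(s, ()))
        new = successors - seen
        if not new:  # no unseen state is reachable: the space has closed
            break
        seen |= new
        frontier = new
        horizon += 1
    return seen, horizon

# Toy generator with four segment states A..D and stochastic successors.
T = {"A": ["B", "C"], "B": ["C"], "C": ["A", "D"], "D": ["B"]}
closure, h = reachable_closure(T, "A")
# Every state an infinite-horizon run could visit already appears by step h.
```

In this toy example, all four states are discovered within two steps, so an infinitely long generation run can never produce a segment state outside that finite closure, which mirrors the paper's argument for quality generalization and its concern about limited diversity.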
Pages: 489-492
Page count: 4
Related Papers
50 records
  • [1] State-space segmentation for faster training reinforcement learning
    Kim, Jongrae
    IFAC PAPERSONLINE, 2022, 55 (25): : 235 - 240
  • [2] Anomaly detection using state-space models and reinforcement learning
    Khazaeli, Shervin
    Nguyen, Luong Ha
    Goulet, James A.
    STRUCTURAL CONTROL & HEALTH MONITORING, 2021, 28 (06)
  • [3] Online state space generation by a growing self-organizing map and differential learning for reinforcement learning
    Notsu, Akira
    Yasuda, Koji
    Ubukata, Seiki
    Honda, Katsuhiro
    APPLIED SOFT COMPUTING, 2020, 97
  • [4] Procedural Level Generation for Sokoban via Deep Learning: An Experimental Study
    Zakaria, Yahia
    Fayek, Magda
    Hadhoud, Mayada
    IEEE TRANSACTIONS ON GAMES, 2023, 15 (01) : 108 - 120
  • [5] Q-Learning Acceleration via State-space Partitioning
    Wei, Haoran
    Corder, Kevin
    Decker, Keith
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 293 - 298
  • [6] STATE-SPACE CHARACTERIZATION OF HUMAN BALANCE THROUGH A REINFORCEMENT LEARNING BASED MUSCLE CONTROLLER
    Akbas, Kubra
    Zhou, Xianlian
    PROCEEDINGS OF ASME 2023 INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, IDETC-CIE2023, VOL 2, 2023,
  • [7] Potential-based reward shaping using state-space segmentation for efficiency in reinforcement learning
    Bal, Melis Ilayda
    Aydin, Hüseyin
    İyigün, Cem
    Polat, Faruk
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 157 : 469 - 484
  • [8] Online Gaussian Process State-space Model: Learning and Planning for Partially Observable Dynamical Systems
    Park, Soon-Seo
    Park, Young-Jin
    Min, Youngjae
    Choi, Han-Lim
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2022, 20 (02) : 601 - 617
  • [10] Dynamic robot routing optimization: State-space decomposition for operations research-informed reinforcement learning
    Loeppenberg, Marlon
    Yuwono, Steve
    Diprasetya, Mochammad Rizky
    Schwung, Andreas
    ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2024, 90