State-Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning

Cited: 0
Authors
Wang, Ziqi [1 ,2 ]
Shu, Tianye [1 ,2 ]
Liu, Jialin [1 ,2 ]
Affiliations
[1] Southern Univ Sci & Technol SUSTech, Dept Comp Sci & Engn, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen 518055, Peoples R China
[2] Southern Univ Sci & Technol SUSTech, Res Inst Trustworthy Autonomous Syst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Games; Training; Reinforcement learning; Generators; Deep learning; Visualization; Hamming distances; Content diversity; online level generation (OLG); platformer games; procedural content generation (PCG); PCG via reinforcement learning (RL);
DOI
10.1109/TG.2023.3262297
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this letter, we revisit endless online level generation with the recently proposed experience-driven procedural content generation via reinforcement learning (EDRL) framework. Motivated by the observation that EDRL tends to generate recurrent patterns, we formulate the notion of state-space closure, which implies that any state that may appear in an infinite-horizon online generation process can already be found within a finite horizon. Through theoretical analysis, we find that although state-space closure raises a concern about diversity, it allows an EDRL generator trained over a finite horizon to generalize to the infinite-horizon scenario without deterioration of content quality. Moreover, we verify the quality and diversity of content generated by EDRL via empirical studies on the widely used Super Mario Bros. benchmark. Experimental results reveal that the diversity of levels generated by EDRL is limited due to state-space closure, whereas their quality does not deteriorate over horizons longer than the one specified during training. Based on these outcomes and our analysis, future work on endless online level generation via reinforcement learning should address the issue of diversity while ensuring the occurrence of state-space closure and preserving quality.
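To illustrate the idea behind state-space closure, the sketch below (not the paper's implementation) models a level generator as a toy finite Markov chain over level segments and expands the reachable state set breadth-first until it stops growing. The state names and transitions are hypothetical; the point is that every state reachable in an infinite-horizon generation process already appears within a finite closure horizon, and the size of the closed set bounds the achievable content diversity.

```python
# Illustrative sketch of state-space closure for a toy stochastic level
# generator modeled as a finite Markov chain. States/transitions are
# hypothetical, not taken from the EDRL paper.

def reachable_closure(initial_states, successors):
    """Expand the reachable set breadth-first until it stops growing.

    Returns (closed_set, closure_horizon): every state that can occur at
    any step of an infinite-horizon generation process already occurs
    within `closure_horizon` steps from the initial states.
    """
    frontier = set(initial_states)
    seen = set(frontier)
    horizon = 0
    while frontier:
        frontier = {t for s in frontier for t in successors(s)} - seen
        if frontier:
            seen |= frontier
            horizon += 1
    return seen, horizon

# Hypothetical 4-segment generator: each segment can be followed by a few others.
transitions = {
    "flat": ["gap", "stairs"],
    "gap": ["flat", "enemy"],
    "stairs": ["flat"],
    "enemy": ["flat", "gap"],
}

closed, h = reachable_closure(["flat"], lambda s: transitions.get(s, []))
print(sorted(closed), h)  # the closed set bounds the diversity of generated segments
```

In this toy chain the reachable set closes after two steps, so an endlessly running generator can never produce a segment outside those four states, which mirrors the diversity limitation the letter reports for EDRL.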
Pages: 489-492 (4 pages)
Related Papers (50 items)
  • [21] Learning Aerial Docking via Offline-to-Online Reinforcement Learning
    Tao, Yang
    Feng Yuting
    Yu, Yushu
    2024 4TH INTERNATIONAL CONFERENCE ON COMPUTER, CONTROL AND ROBOTICS, ICCCR 2024, 2024, : 305 - 309
  • [22] Online reinforcement learning for a continuous space system with experimental validation
    Dogru, Oguzhan
    Wieczorek, Nathan
    Velswamy, Kirubakaran
    Ibrahim, Fadi
    Huang, Biao
    JOURNAL OF PROCESS CONTROL, 2021, 104 (104) : 86 - 100
  • [23] Reduction of state space in reinforcement learning by sensor selection
    Kishima, Yasutaka
    Kurashige, Kentarou
    ARTIFICIAL LIFE AND ROBOTICS, 2013, 18 (1-2) : 7 - 14
  • [24] Constructivist Approach to State Space Adaptation in Reinforcement Learning
    Guerian, Maxime
    Cardozo, Nicolas
    Dusparic, Ivana
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON SELF-ADAPTIVE AND SELF-ORGANIZING SYSTEMS (SASO), 2019, : 52 - 61
  • [25] Reinforcement learning for energy management of an islanded microgrid: Analysis of state, state space and reinforcement function
    Jayaraj, Saritha
    Ahamed, T. P. Imthias
    Abraham, Mathew P.
    Jasmin, E. A.
    Sulthan, Sheik Mohammed
    SUSTAINABLE ENERGY GRIDS & NETWORKS, 2024, 38
  • [26] Scaling Representation Learning From Ubiquitous ECG With State-Space Models
    Avramidis, Kleanthis
    Kunc, Dominika
    Perz, Bartosz
    Adsul, Kranti
    Feng, Tiantian
    Kazienko, Przemyslaw
    Saganowski, Stanislaw
    Narayanan, Shrikanth
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (10) : 5877 - 5889
  • [27] Online Personalization of Compression in Hearing Aids via Maximum Likelihood Inverse Reinforcement Learning
    Akbarzadeh, Sara
    Lobarinas, Edward
    Kehtarnavaz, Nasser
    IEEE ACCESS, 2022, 10 : 58537 - 58546
  • [28] Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization
    Lampton A.
    Valasek J.
    Kumar M.
    Journal of Control Theory and Applications, 2011, 9 (3): : 431 - 439
  • [30] State-Space Compression for Efficient Policy Learning in Crude Oil Scheduling
    Ma, Nan
    Li, Hongqi
    Liu, Hualin
    MATHEMATICS, 2024, 12 (03)