Topological Visualization Method for Understanding the Landscape of Value Functions and Structure of the State Space in Reinforcement Learning

Cited by: 1
Authors
Nakamura, Yuki [1 ]
Shibuya, Takeshi [2 ]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
Source
ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2020
Keywords
Reinforcement Learning; Topological Data Analysis; TDA Mapper; Visualization;
DOI
10.5220/0008913303700377
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning is a learning framework, applied in various fields, in which agents autonomously acquire control rules. To use it, the designer constructs a state space and a reward function and sets various parameters to obtain the desired performance. The actual performance of the agent depends on this design; accordingly, a poor design causes poor performance. In that case, the designer needs to examine the cause of the poor performance, and to do so it is important to understand the agent's current control rules. When the state space has two or fewer dimensions, visualizing the landscape of the value function and the structure of the state space is the most powerful way to understand these rules. However, in higher dimensions, no such visualization method exists. In this paper, we propose a method to visualize the landscape of the value function and the structure of the state space even when the state space is high-dimensional. Concretely, we employ topological data analysis for the visualization. We confirm the effectiveness of the proposed method via several numerical experiments.
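The abstract names the TDA Mapper technique without detail. As a rough illustration only (not the authors' code), a minimal Mapper construction might use the learned value function V(s) as the lens: cover its range with overlapping intervals, cluster the states falling into each interval, and connect clusters that share states. The function name and parameters below are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def mapper_graph(states, values, n_intervals=5, overlap=0.3, eps=0.5):
    """Sketch of a Mapper graph with the value function as the lens.

    Nodes are clusters of states whose values fall in one overlapping
    interval of the lens range; an edge joins two nodes that share at
    least one state. All parameters are illustrative choices.
    """
    lo, hi = values.min(), values.max()
    length = (hi - lo) / n_intervals          # interval width
    step = length * (1 - overlap)             # shift between intervals
    nodes = []                                # each node: a set of state indices
    start = lo
    while start < hi:
        # states whose value falls in the current lens interval
        idx = np.flatnonzero((values >= start) & (values <= start + length))
        if idx.size:
            # cluster those states in the (possibly high-dim) state space
            labels = DBSCAN(eps=eps, min_samples=1).fit_predict(states[idx])
            for lab in np.unique(labels):
                nodes.append(set(idx[labels == lab]))
        start += step
    # connect clusters that share at least one state
    edges = {(i, j)
             for i in range(len(nodes))
             for j in range(i + 1, len(nodes))
             if nodes[i] & nodes[j]}
    return nodes, edges
```

On a toy example where states lie along a line and the value grows with position, this yields a path-shaped graph: each lens interval produces one cluster, and consecutive intervals share states.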
Pages: 370-377 (8 pages)
Related Papers
50 records
  • [31] Swarm Reinforcement Learning Methods for Problems with Continuous State-Action Space
    Iima, Hitoshi
    Kuroe, Yasuaki
    Emoto, Kazuo
    2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011, : 2173 - 2180
  • [32] BEHAVIOR ACQUISITION ON A MOBILE ROBOT USING REINFORCEMENT LEARNING WITH CONTINUOUS STATE SPACE
    Arai, Tomoyuki
    Toda, Yuichiro
    Kubota, Naoyuki
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2019, : 458 - 461
  • [33] A novel reinforcement learning-based method for structure optimization
    Mei, Zijian
    Yang, Zhouwang
    Chen, Jingrun
    ENGINEERING OPTIMIZATION, 2024
  • [34] Constructing Continuous Action Space from Basis Functions for Fast and Stable Reinforcement Learning
    Yamaguchi, Akihiko
    Takamatsu, Jun
    Ogasawara, Tsukasa
    RO-MAN 2009: THE 18TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, VOLS 1 AND 2, 2009, : 718 - 724
  • [35] A Method of Role Differentiation Using a State Space Filter with a Waveform Changing Parameter in Multi-agent Reinforcement Learning
    Nagayoshi, Masato
    Elderton, Simon
    Tamaki, Hisashi
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB 2021), 2021, : 461 - 464
  • [36] GDT: Multi-agent reinforcement learning framework based on adaptive grouping dynamic topological space
    Sun, Licheng
    Ma, Hongbin
    Guo, Zhentao
    INFORMATION SCIENCES, 2025, 691
  • [37] A Deep Recurrent-Reinforcement Learning Method for Intelligent AutoScaling of Serverless Functions
    Agarwal, Siddharth
    Rodriguez, Maria A.
    Buyya, Rajkumar
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2024, 17 (05) : 1899 - 1910
  • [38] Online state space generation by a growing self-organizing map and differential learning for reinforcement learning
    Notsu, Akira
    Yasuda, Koji
    Ubukata, Seiki
    Honda, Katsuhiro
    APPLIED SOFT COMPUTING, 2020, 97
  • [39] Integrating Symmetry of Environment by Designing Special Basis functions for Value Function Approximation in Reinforcement Learning
    Wang, Guo-fang
    Fang, Zhou
    Li, Bo
    Li, Ping
    2016 14TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2016
  • [40] Visualization of Learning Process in "State and Action" Space Using Self-Organizing Maps
    Notsu, Akira
    Hattori, Yuichi
    Ubukata, Seiki
    Honda, Katsuhiro
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2016, 20 (06) : 983 - 991