Topological Visualization Method for Understanding the Landscape of Value Functions and Structure of the State Space in Reinforcement Learning

Cited by: 1

Authors:

Nakamura, Yuki [1]

Shibuya, Takeshi [2]

Affiliations:

[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan

[2] Univ Tsukuba, Fac Engn Informat & Syst, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan

Source:

ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2 | 2020

Keywords:

Reinforcement Learning; Topological Data Analysis; TDA Mapper; Visualization

DOI:

10.5220/0008913303700377

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Reinforcement learning is a learning framework applied in various fields in which agents autonomously acquire control rules. Using this method, the designer constructs a state space and reward function and sets various parameters to obtain ideal performance. The actual performance of the agent depends on the design. Accordingly, a poor design causes poor performance. In that case, the designer needs to examine the cause of the poor performance; to do so, it is important for the designer to understand the current agent control rules. In the case where the state space is less than or equal to two dimensions, visualizing the landscape of the value function and the structure of the state space is the most powerful method to understand these rules. However, in other cases, there is no method for such a visualization. In this paper, we propose a method to visualize the landscape of the value function and the structure of the state space even when the state space has a high number of dimensions. Concretely, we employ topological data analysis for the visualization. We confirm the effectiveness of the proposed method via several numerical experiments.
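The abstract names topological data analysis (and the keywords name TDA Mapper) but gives no algorithmic detail. As a rough illustration only, the following toy sketch shows the generic Mapper construction with a value function used as the filter: cover the value range with overlapping intervals, cluster the states inside each interval, and connect clusters that share states. It is not the authors' implementation; the function names, the parameters `n_intervals`, `overlap`, and `eps`, the single-linkage clustering, and the corridor example are all assumptions made for this sketch.

```python
from itertools import combinations
from math import dist


def mapper_graph(points, values, n_intervals=4, overlap=0.25, eps=1.5):
    """Toy Mapper (hypothetical helper): cover the range of `values` with
    overlapping intervals, cluster the points that fall in each interval,
    and link clusters that share at least one point."""
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals or 1.0  # avoid zero-length intervals
    nodes = []  # each node is a frozenset of point indices
    for k in range(n_intervals):
        a = lo + k * length - overlap * length
        b = lo + (k + 1) * length + overlap * length
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(_single_linkage(points, members, eps))
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}
    return nodes, edges


def _single_linkage(points, members, eps):
    """Split `members` into connected components of the graph that joins
    two points whenever their Euclidean distance is at most `eps`."""
    unvisited, clusters = set(members), []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            p = stack.pop()
            near = {q for q in unvisited if dist(points[p], points[q]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


# Assumed example: ten states along a corridor embedded in 3-D, with a
# value that increases toward the goal at the last state.
states = [(i, 0, 0) for i in range(10)]
values = [i - 9 for i in range(10)]
nodes, edges = mapper_graph(states, values, n_intervals=3)
```

In this assumed corridor example the result is three nodes joined in a chain, so the graph mirrors the one-dimensional structure of the states even though they live in three dimensions. Colouring each node by the mean value of its states would give a picture of the value landscape in the same graph.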

Pages: 370-377

Page count: 8
