Topological Visualization Method for Understanding the Landscape of Value Functions and Structure of the State Space in Reinforcement Learning

Cited by: 1
Authors
Nakamura, Yuki [1]
Shibuya, Takeshi [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
Source
ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2 | 2020
Keywords
Reinforcement Learning; Topological Data Analysis; TDA Mapper; Visualization
DOI
10.5220/0008913303700377
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning is a learning framework applied in various fields in which agents autonomously acquire control rules. Using this method, the designer constructs a state space and reward function and sets various parameters to obtain ideal performance. The actual performance of the agent depends on the design. Accordingly, a poor design causes poor performance. In that case, the designer needs to examine the cause of the poor performance; to do so, it is important for the designer to understand the current agent control rules. In the case where the state space is less than or equal to two dimensions, visualizing the landscape of the value function and the structure of the state space is the most powerful method to understand these rules. However, in other cases, there is no method for such a visualization. In this paper, we propose a method to visualize the landscape of the value function and the structure of the state space even when the state space has a high number of dimensions. Concretely, we employ topological data analysis for the visualization. We confirm the effectiveness of the proposed method via several numerical experiments.
Pages: 370-377
Page count: 8