A UNIFIED FRAMEWORK FOR LINEAR FUNCTION APPROXIMATION OF VALUE FUNCTIONS IN STOCHASTIC CONTROL

Times Cited: 0
Authors
Sanchez-Fernandez, Matilde [1]
Valcarcel, Sergio [2]
Zazo, Santiago [2]
Affiliations
[1] Univ Carlos III Madrid, Signal Theory & Commun Dept, Av La Univ 30, Leganes 28911, Spain
[2] Univ Politecn Madrid, Signals Syst & Radiocommun Dept, E-28040 Madrid, Spain
Source
2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2013
Keywords
Approximate dynamic programming; Linear value function approximation; Mean squared Bellman Error; Mean squared projected Bellman Error; Reinforcement Learning;
DOI
None available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Subject Classification Codes
0808; 0809
Abstract
This paper contributes a unified formulation that merges previous analyses of predicting the performance (value function) of a given sequence of actions (policy) when an agent operates a Markov decision process with a large state space. When the states are represented by features and the value function is approximated linearly, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results compared with state-of-the-art solutions.
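For context, the two cost functions named in the keywords have standard definitions in the reinforcement-learning literature; the following is a sketch in conventional notation, which may differ from the paper's own symbols. With feature matrix \(\Phi\), linear estimate \(V_\theta = \Phi\theta\), Bellman operator \(T^{\pi}\), diagonal matrix \(D\) of stationary state probabilities, and \(\Pi\) the \(D\)-weighted projection onto the span of \(\Phi\):

\[
\mathrm{MSBE}(\theta) = \lVert T^{\pi}\Phi\theta - \Phi\theta \rVert_{D}^{2},
\qquad
\mathrm{MSPBE}(\theta) = \lVert \Pi\, T^{\pi}\Phi\theta - \Phi\theta \rVert_{D}^{2}.
\]

The distinction matters because the minimizer of the MSPBE coincides with the fixed point that temporal-difference learning converges to, which is why gradient-TD-style methods target the projected objective rather than the MSBE.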
Pages: 5
Related Papers
50 records in total
  • [31] Differentially Private Reinforcement Learning with Linear Function Approximation
    Zhou, Xingyu
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2022, 6 (01)
  • [32] Optimal and instance-dependent guarantees for Markovian linear stochastic approximation
    Mou, Wenlong
    Pananjady, Ashwin
    Wainwright, Martin J.
    Bartlett, Peter L.
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022
  • [33] A grey approximation approach to state value function in reinforcement learning
    Hwang, Kao-Shing
    Chen, Yu-Jen
    Lee, Guar-Yuan
    2007 IEEE INTERNATIONAL CONFERENCE ON INTEGRATION TECHNOLOGY, PROCEEDINGS, 2007, : 379+
  • [34] Distributed Value Function Approximation for Collaborative Multiagent Reinforcement Learning
    Stankovic, Milos S.
    Beko, Marko
    Stankovic, Srdjan S.
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2021, 8 (03) : 1270 - 1280
  • [35] On the convergence of temporal-difference learning with linear function approximation
    Tadic, V
    MACHINE LEARNING, 2001, 42 (03) : 241 - 267
  • [37] A unified view of configurable Markov Decision Processes: Solution concepts, value functions, and operators
    Metelli, Alberto Maria
    INTELLIGENZA ARTIFICIALE, 2022, 16 (02) : 165 - 184
  • [38] TENSOR LOW-RANK APPROXIMATION OF FINITE-HORIZON VALUE FUNCTIONS
    Rozada, Sergio
    Marques, Antonio G.
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 5975 - 5979
  • [39] Modified value-function-approximation for synchronous policy iteration with single-critic configuration for nonlinear optimal control
    Tang, Difan
    Chen, Lei
    Tian, Zhao Feng
    Hu, Eric
    INTERNATIONAL JOURNAL OF CONTROL, 2021, 94 (05) : 1321 - 1333
  • [40] Using Reinforcement Learning to Control Traffic Signals in a Real-World Scenario: An Approach Based on Linear Function Approximation
    Alegre, Lucas N.
    Ziemke, Theresa
    Bazzan, Ana L. C.
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (07) : 9126 - 9135