A UNIFIED FRAMEWORK FOR LINEAR FUNCTION APPROXIMATION OF VALUE FUNCTIONS IN STOCHASTIC CONTROL

Cited by: 0
Authors
Sanchez-Fernandez, Matilde [1 ]
Valcarcel, Sergio [2 ]
Zazo, Santiago [2 ]
Affiliations
[1] Univ Carlos III Madrid, Signal Theory & Commun Dept, Av La Univ 30, Leganes 28911, Spain
[2] Univ Politecn Madrid, Signals Syst & Radiocommun Dept, E-28040 Madrid, Spain
Source
2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013
Keywords
Approximate dynamic programming; Linear value function approximation; Mean squared Bellman error; Mean squared projected Bellman error; Reinforcement learning
DOI: Not available
Chinese Library Classification (CLC): TM (Electrical Engineering); TN (Electronics & Communication Technology)
Subject Classification Codes: 0808; 0809
Abstract
This paper contributes a unified formulation that merges previous analyses of predicting the performance (value function) of a given sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is approximated linearly, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. This analysis also allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
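The setting the abstract describes, a fixed policy inducing a Markov reward process whose value function is approximated linearly as V(s) ≈ φ(s)ᵀw, can be sketched numerically. The toy chain, features, and dimensions below are illustrative assumptions, and the fixed point computed is the standard minimizer of the mean squared projected Bellman error (the LSTD solution), not the paper's proposed adaptive algorithm:

```python
import numpy as np

# Hypothetical toy setup: a 5-state Markov reward process with random
# transitions and rewards, and 3-dimensional linear features.
rng = np.random.default_rng(0)
n_states, n_features = 5, 3
gamma = 0.9

Phi = rng.normal(size=(n_states, n_features))       # row s = phi(s)
P = rng.dirichlet(np.ones(n_states), size=n_states)  # transition matrix
r = rng.normal(size=n_states)                        # expected rewards

# Stationary distribution d of P: left eigenvector for eigenvalue 1.
evals, evecs = np.linalg.eig(P.T)
d = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
d = d / d.sum()
D = np.diag(d)

# MSPBE minimizer (the LSTD fixed point): solve A w = b with
#   A = Phi^T D (I - gamma P) Phi,   b = Phi^T D r.
A = Phi.T @ D @ (np.eye(n_states) - gamma * P) @ Phi
b = Phi.T @ D @ r
w = np.linalg.solve(A, b)

# Exact value function for comparison: V = (I - gamma P)^{-1} r.
V_true = np.linalg.solve(np.eye(n_states) - gamma * P, r)
V_hat = Phi @ w
print("approx:", np.round(V_hat, 3))
print("exact: ", np.round(V_true, 3))
```

At this fixed point the Bellman residual r + γP·V̂ − V̂ is orthogonal (in the d-weighted inner product) to the feature space, which is exactly the MSPBE-zero condition; minimizing the plain mean squared Bellman error would instead yield a generally different weight vector, and relating these two cost functions is the subject of the paper.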
Pages: 5