Fitted Q-Iteration via Max-Plus-Linear Approximation

Authors
Liu, Yichen [1]
Kolarijani, Mohamad Amin Sharifi [1]
Affiliations
[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands
Source
IEEE CONTROL SYSTEMS LETTERS | 2024 / Vol. 8
Keywords
Approximation algorithms; Convergence; Vectors; Standards; Optimal control; Complexity theory; Algebra; Real-time systems; Neural networks; Medical services; Reinforcement learning; stochastic optimal control; computational methods;
DOI
10.1109/LCSYS.2024.3520060
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this letter, we consider the application of max-plus-linear approximators for the Q-function in offline reinforcement learning of discounted Markov decision processes. In particular, we incorporate these approximators to propose novel fitted Q-iteration (FQI) algorithms with provable convergence. Exploiting the compatibility of the Bellman operator with max-plus operations, we show that the max-plus-linear regression within each iteration of the proposed FQI algorithm reduces to simple max-plus matrix-vector multiplications. We also consider the variational implementation of the proposed algorithm, which leads to a per-iteration complexity that is independent of the number of samples.
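For readers unfamiliar with max-plus algebra, the following minimal sketch illustrates what a "max-plus matrix-vector multiplication" looks like, where addition is replaced by max and multiplication by ordinary addition. It is not the letter's algorithm: the function names, the feature matrix `Phi`, and the targets `y` are hypothetical, and the fit shown is the standard residuation formula for the greatest vector `theta` whose max-plus product with `Phi` stays below `y`, used here only as an assumed stand-in for a max-plus-linear regression step.

```python
def maxplus_matvec(A, x):
    """Max-plus product: (A (x) x)[i] = max_j (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


def greatest_subsolution(Phi, y):
    """Largest theta with maxplus_matvec(Phi, theta) <= y elementwise.

    Computed as a min-plus product: theta[k] = min_i (y[i] - Phi[i][k]).
    """
    n_features = len(Phi[0])
    return [
        min(y_i - row[k] for row, y_i in zip(Phi, y))
        for k in range(n_features)
    ]


# Hypothetical data: 3 samples, 2 max-plus basis functions.
Phi = [[0.0, -1.0],
       [-2.0, 0.0],
       [-1.0, -1.0]]
y = [1.0, 3.0, 2.0]

theta = greatest_subsolution(Phi, y)   # fitted max-plus weights
fit = maxplus_matvec(Phi, theta)       # approximator evaluated at the samples
print(theta, fit)
```

Both steps use only additions and max/min comparisons, which is the kind of structure the abstract refers to; how the letter actually builds the regression targets and the variational form is not reproduced here.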
Pages: 3201 / 3206
Page count: 6
Related papers
14 in total
  • [1] Logically-Constrained Neural Fitted Q-iteration
    Hasanbeig, Mohammadhosein
    Abate, Alessandro
    Kroening, Daniel
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2012 - 2014
  • [2] CFQI: Fitted Q-Iteration with Complex Returns
    Wright, Robert
    Qiao, Xingye
    Yu, Lei
    Loscalzo, Steven
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 163 - 170
  • [3] Fitted Q-iteration by Functional Networks for control problems
    Gaeta, Matteo
    Loia, Vincenzo
    Miranda, Sergio
    Tomasiello, Stefania
    APPLIED MATHEMATICAL MODELLING, 2016, 40 (21-22) : 9183 - 9196
  • [4] Fitted Q-iteration and functional networks for ubiquitous recommender systems
    Gaeta, Matteo
    Orciuoli, Francesco
    Rarita, Luigi
    Tomasiello, Stefania
    SOFT COMPUTING, 2017, 21 (23) : 7067 - 7075
  • [5] Finite Abstractions of Max-Plus-Linear Systems
    Adzkiya, Dieky
    De Schutter, Bart
    Abate, Alessandro
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2013, 58 (12) : 3039 - 3053
  • [6] Scenario-based fitted Q-iteration for adaptive control of water reservoir systems under uncertainty
    Bertoni, Federica
    Giuliani, Matteo
    Castelletti, Andrea
    IFAC PAPERSONLINE, 2017, 50 (01) : 3183 - 3188
  • [7] Computational techniques for reachability analysis of Max-Plus-Linear systems
    Adzkiya, Dieky
    De Schutter, Bart
    Abate, Alessandro
    AUTOMATICA, 2015, 53 : 293 - 302
  • [8] Tree-based fitted Q-iteration for multi-objective Markov decision processes in water resource management
    Pianosi, F.
    Castelletti, A.
    Restelli, M.
    JOURNAL OF HYDROINFORMATICS, 2013, 15 (02) : 258 - 270
  • [9] Finite-horizon min-max control of max-plus-linear systems
    Necoara, Ion
    Kerrigan, Eric C.
    De Schutter, Bart
    van den Boom, Ton J. J.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2007, 52 (06) : 1088 - 1093
  • [10] Linear Fitted-Q Iteration with Multiple Reward Functions
    Lizotte, Daniel J.
    Bowling, Michael
    Murphy, Susan A.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 3253 - 3295