Fitted Q-Iteration via Max-Plus-Linear Approximation

Cited by: 0

Authors:

Liu, Yichen [1]

Kolarijani, Mohamad Amin Sharifi [1]

Affiliations:

[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands

Source:

IEEE Control Systems Letters | 2024 / Vol. 8

Keywords:

Approximation algorithms; Convergence; Vectors; Standards; Optimal control; Complexity theory; Algebra; Real-time systems; Neural networks; Medical services; Reinforcement learning; stochastic optimal control; computational methods

DOI:

10.1109/LCSYS.2024.3520060

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

In this letter, we consider the application of max-plus-linear approximators for the Q-function in offline reinforcement learning of discounted Markov decision processes. In particular, we incorporate these approximators to propose novel fitted Q-iteration (FQI) algorithms with provable convergence. Exploiting the compatibility of the Bellman operator with max-plus operations, we show that the max-plus-linear regression within each iteration of the proposed FQI algorithm reduces to simple max-plus matrix-vector multiplications. We also consider a variational implementation of the proposed algorithm, which leads to a per-iteration complexity that is independent of the number of samples.
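For readers unfamiliar with the max-plus operations the abstract refers to, the following is a minimal sketch of a max-plus matrix-vector product and of evaluating a max-plus-linear function. It only illustrates the algebra, in which "addition" is max and "multiplication" is +; it is not the FQI algorithm of the letter. The function names, the matrix `A`, and the vectors are hypothetical values chosen for illustration.

```python
NEG_INF = float("-inf")  # plays the role of "zero" in max-plus algebra


def maxplus_matvec(A, x):
    """Max-plus matrix-vector product: y[i] = max_j (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


def maxplus_linear(features, theta):
    """Max-plus-linear combination: max_k (features[k] + theta[k])."""
    return max(f + t for f, t in zip(features, theta))


# Hypothetical example data (not taken from the letter).
A = [[0.0, 2.0],
     [NEG_INF, 1.0]]
x = [3.0, 1.0]

y = maxplus_matvec(A, x)                      # row-wise max of sums
q = maxplus_linear([0.0, -1.0], [1.5, 3.0])   # max(0.0 + 1.5, -1.0 + 3.0)
print(y, q)
```

Each output entry costs one pass over a row with additions and a running maximum, which is why such products are cheap compared with solving a least-squares regression.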

Pages: 3201-3206

Page count: 6
