Enhanced Exploration Least-Squares Methods for Optimal Stopping Problems

Cited by: 6
Authors
Forootani, Ali [1]
Tipaldi, Massimo [2]
Iervolino, Raffaele [3]
Dey, Subhrakanti [1]
Affiliations
[1] Maynooth Univ, Hamilton Inst, Maynooth W23 F2K8, Kildare, Ireland
[2] Univ Sannio, Dept Engn, I-82100 Benevento, Italy
[3] Univ Naples Federico II, Dept Elect Engn & Informat Technol, I-80125 Naples, Italy
Source
IEEE CONTROL SYSTEMS LETTERS | 2022 / Vol. 6
Keywords
Markov processes; Cost function; Probability distribution; Steady-state; Monte Carlo methods; Mathematical model; Computational modeling; Optimal stopping problem; Markov decision process; approximate dynamic programming; APPROXIMATION; ALGORITHMS
DOI
10.1109/LCSYS.2021.3069708
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This letter presents an Approximate Dynamic Programming (ADP) least-squares based approach for solving optimal stopping problems with a large state space. By extending some previous work in the area of optimal stopping problems, it provides a framework for their formulation and resolution. The proposed method uses a combined on/off policy exploration mechanism, where states are generated by means of state transition probability distributions different from the ones dictated by the underlying Markov decision processes. The contraction mapping property of the associated projected Bellman operator is analysed as well as the convergence of the resulting algorithm.
Pages: 271-276
Number of pages: 6
Related papers
17 in total
[1]  
Andrén MT, 2019, 2019 18TH EUROPEAN CONTROL CONFERENCE (ECC), P2832, DOI 10.23919/ECC.2019.8795838
[2]  
[Anonymous], 2012, Athena Scientific
[3]   Consistent Event-Triggered Control for Discrete-Time Linear Systems With Partial State Information [J].
Antunes, Duarte J. ;
Balaghi, M. H., I .
IEEE CONTROL SYSTEMS LETTERS, 2020, 4 (01) :181-186
[4]   Temporal Difference Methods for General Projected Equations [J].
Bertsekas, Dimitri P. .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2011, 56 (09) :2128-2139
[5]  
Forootani A., 2020, IFAC J SYST CONTROL, V13, P1
[6]   Approximate dynamic programming for stochastic resource allocation problems [J].
Forootani, Ali ;
Iervolino, Raffaele ;
Tipaldi, Massimo ;
Neilson, Joshua .
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2020, 7 (04) :975-990
[7]   Applying unweighted least-squares based techniques to stochastic dynamic programming: theory and application [J].
Forootani, Ali ;
Iervolino, Raffaele ;
Tipaldi, Massimo .
IET CONTROL THEORY AND APPLICATIONS, 2019, 13 (15) :2387-2398
[8]   Algorithmic Survey of Parametric Value Function Approximation [J].
Geist, Matthieu ;
Pietquin, Olivier .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (06) :845-867
[9]   On minimum-variance event-triggered control [J].
Goldenshluger, Alexander ;
Mirkin, Leonid .
IEEE CONTROL SYSTEMS LETTERS, 2017, 1 (01) :32-37
[10]  
Huizhen Yu, 2007, Proceedings of the European Control Conference 2007 (ECC), P2368