Volume-weighted Bellman error method for adaptive meshing in approximate dynamic programming

Cited by: 3
Authors
Armesto, Leopoldo [1]
Sala, Antonio [2]
Affiliations
[1] Univ Politecn Valencia, Inst Diseno & Fabricac, Cno Vera S-N, Valencia 46022, Spain
[2] Univ Politecn Valencia, Inst Univ Automat & Informat Ind, Cno Vera S-N, Valencia 46022, Spain
Source
REVISTA IBEROAMERICANA DE AUTOMATICA E INFORMATICA INDUSTRIAL | 2022, Vol. 19, No. 1
Keywords
Intelligent control; approximate dynamic programming; optimal control; neural learning
DOI
10.4995/riai.2021.15698
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Optimal control and reinforcement learning have an associated "value function" that must be suitably approximated. Value function approximation problems usually have different precision requirements in different regions of the state space. A uniform grid wastes resources in regions where the value function is smooth and, on the other hand, lacks resolution in zones with abrupt changes. The present work proposes an adaptive meshing methodology that adapts to these changing requirements without excessively increasing the number of parameters of the approximator. The proposal is based on simplicial meshes and the Bellman error, with criteria to add and remove points from the mesh: it modifies proposals from earlier literature by including the volume of the affected simplices, and it provides methods to manipulate the mesh triangulation.
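The idea of weighting the Bellman error by simplex volume can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the exact criterion, so the score used here (triangle area times the absolute Bellman error at the centroid) is an assumption. The function names, the toy mesh, and the `toy_bellman_error` function are likewise hypothetical.

```python
def triangle_area(p0, p1, p2):
    """Area of a 2-D simplex (triangle) from the cross product of two edges."""
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
    return abs(cross) / 2.0


def refinement_scores(points, triangles, bellman_error):
    """Assumed criterion: area of each triangle times |Bellman error| at its centroid."""
    scores = []
    for tri in triangles:
        p0, p1, p2 = (points[i] for i in tri)
        centroid = ((p0[0] + p1[0] + p2[0]) / 3.0, (p0[1] + p1[1] + p2[1]) / 3.0)
        scores.append(triangle_area(p0, p1, p2) * abs(bellman_error(centroid)))
    return scores


def pick_triangle_to_refine(points, triangles, bellman_error):
    """Index of the triangle with the largest volume-weighted score."""
    scores = refinement_scores(points, triangles, bellman_error)
    return max(range(len(scores)), key=scores.__getitem__)


def toy_bellman_error(x):
    """Hypothetical error field: larger error in the region x[0] >= 4."""
    return 4.0 if x[0] >= 4.0 else 1.0


# Toy mesh: one large triangle (area 8) and one small triangle (area 0.5).
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (5.0, 0.0), (4.0, 1.0)]
triangles = [(0, 1, 2), (1, 3, 4)]

scores = refinement_scores(points, triangles, toy_bellman_error)
chosen = pick_triangle_to_refine(points, triangles, toy_bellman_error)
```

In this toy mesh the small triangle has the larger pointwise error, yet the large triangle gets the higher weighted score, so a volume-weighted rule would refine it first. A rule based on the raw error alone would pick the small triangle.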
Pages: 37-47
Number of pages: 11