Volume-weighted Bellman error method for adaptive meshing in approximate dynamic programming

Cited by: 3

Authors:

Armesto, Leopoldo [1]

Sala, Antonio [2]

Affiliations:

[1] Univ Politecn Valencia, Inst Diseno & Fabricac, Cno Vera S-N, Valencia 46022, Spain

[2] Univ Politecn Valencia, Inst Univ Automat & Informat Ind, Cno Vera S-N, Valencia 46022, Spain

Source:

REVISTA IBEROAMERICANA DE AUTOMATICA E INFORMATICA INDUSTRIAL | 2022 / Vol. 19 / No. 01

Keywords:

Intelligent control; approximate dynamic programming; optimal control; neural learning

DOI:

10.4995/riai.2021.15698

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Optimal control and reinforcement learning have an associated "value function" that must be suitably approximated. Value function approximation problems usually have different precision requirements in different regions of the state space. A uniform grid wastes resources in regions where the value function is smooth and, conversely, lacks resolution in zones with abrupt changes. This work proposes an adaptive meshing methodology that adapts to these changing requirements without greatly increasing the number of parameters of the approximator. The proposal is based on simplicial meshes and the Bellman error, with criteria to add and remove points from the mesh: it modifies proposals from earlier literature to include the volume of the affected simplices, and it provides methods to manipulate the mesh triangulation.
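The idea of weighting the Bellman error by simplex volume can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas, so the scoring rule (absolute Bellman error at the simplex centroid multiplied by the simplex volume), the function names, and the toy error function below are all assumptions made for illustration only.

```python
from math import factorial


def _det(m):
    """Determinant by Laplace expansion; adequate for the small dimensions of a sketch."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * _det(minor)
    return total


def simplex_volume(vertices):
    """Volume of a d-simplex with d+1 vertices in R^d: |det(edge matrix)| / d!."""
    d = len(vertices) - 1
    base = vertices[0]
    edges = [[v[k] - base[k] for k in range(d)] for v in vertices[1:]]
    return abs(_det(edges)) / factorial(d)


def centroid(vertices):
    """Arithmetic mean of the simplex vertices."""
    n = len(vertices)
    return [sum(v[k] for v in vertices) / n for k in range(len(vertices[0]))]


def refinement_scores(simplices, bellman_error):
    """Hypothetical criterion: |Bellman error at centroid| * simplex volume."""
    return [abs(bellman_error(centroid(s))) * simplex_volume(s) for s in simplices]


# Toy stand-in for a Bellman error; a real one would come from the
# value-function approximator and the system model.
def toy_error(x):
    return x[0] + x[1]


big = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]      # large triangle, moderate error
small = [(2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]    # tiny triangle, larger pointwise error
scores = refinement_scores([big, small], toy_error)
refine_first = scores.index(max(scores))         # index of the simplex to refine first
```

In this toy case the large triangle receives the higher score even though its pointwise error is smaller, which is the kind of effect a volume weighting is meant to produce: a mesh point is added where the error affects a larger part of the state space.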

Citation:

Pages: 37-47

Page count: 11

References (32 in total):

[1] Albertos P., 2006, Multivariable control systems: an engineering approach

[2] Allgower F., 2012, Nonlinear model predictive control

[3] Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path [J]. Antos, Andras; Szepesvari, Csaba; Munos, Remi. MACHINE LEARNING, 2008, 71 (01): 89-129

[4] Ariño C., 2014, IEEE INT FUZZY SYST, P2288, DOI 10.1109/FUZZ-IEEE.2014.6891633

[5] Guaranteed cost control analysis and iterative design for constrained Takagi-Sugeno systems [J]. Arino, Carlos; Perez, Emilio; Sala, Antonio. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2010, 23 (08): 1420-1427

[6] Duality-Based Nonlinear Quadratic Control: Application to Mobile Robot Trajectory-Following [J]. Armesto, Leopoldo; Girbes, Vicent; Sala, Antonio; Zima, Miroslav; Smidl, Vaclav. IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2015, 23 (04): 1494-1504

[7] Athans M., 2013, OPTIMAL CONTROL INTR

[8] Bertsekas D. P., 1996, Neuro-Dynamic Programming

[9] Bertsekas D. P., 2018, ABSTRACT DYNAMIC PRO

[10] Busoniu L., 2010, AUTOM CONTROL ENG SE, P1, DOI 10.1201/9781439821091-f
