Approximate Dynamic Programming Recurrence Relations for a Hybrid Optimal Control Problem

Cited by: 0

Authors:

Lu, W. [1]

Ferrari, S. [1]

Fierro, R. [2]

Wettergren, T. A. [3]

Affiliations:

[1] Duke Univ, Dept Mech Engn & Mat Sci, LISC, Durham, NC 27706 USA

[2] Univ New Mexico, Dept Elect & Comp Engn, Multi Agent Robot Hybrid & Embedded Syst Lab, Albuquerque, NM 87131 USA

[3] Naval Undersea Warfare Ctr, Newport, RI USA

Source:

UNMANNED SYSTEMS TECHNOLOGY XIV | 2012 / Vol. 8387

Keywords:

Approximate dynamic programming (ADP); hybrid systems; optimal control; FRAMEWORK;

DOI:

10.1117/12.919286

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a hybrid approximate dynamic programming (ADP) method for a hybrid dynamic system (HDS) optimal control problem, which arises in many complex unmanned systems that are implemented via a hybrid architecture to account for robot modes or a complex environment. The HDS considered in this paper is characterized by a well-known three-layer hybrid framework, which includes a discrete event controller layer, a discrete-continuous interface layer, and a continuous state layer. The hybrid optimal control problem (HOCP) is to find the optimal discrete event decisions and the optimal continuous controls through the deterministic minimization of a scalar function of the system state and control over time. Due to the uncertainty of the environment and the complexity of the HOCP, the cost-to-go cannot be evaluated before the HDS explores the entire system state space; as a result, neither the continuous nor the discrete optimal control is available ahead of time. Therefore, ADP is adopted to learn the optimal control while the HDS explores the environment, taking advantage of the online nature of the ADP method. Furthermore, ADP can break the curses of dimensionality that other optimization methods, such as dynamic programming (DP) and Markov decision processes (MDPs), face due to the high dimensionality of the HOCP.


Pages: 11
