A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems

Cited by: 301

Authors:

Todorov, E [1]

Li, WW [1]

Affiliations:

[1] Univ Calif San Diego, Fac Cognit Sci Dept, La Jolla, CA 92093 USA

Source:

ACC: PROCEEDINGS OF THE 2005 AMERICAN CONTROL CONFERENCE, VOLS 1-7 | 2005

DOI:

10.1109/acc.2005.1469949

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

We present an iterative Linear-Quadratic-Gaussian method for locally-optimal feedback control of nonlinear stochastic systems subject to control constraints. Previously, similar methods have been restricted to deterministic unconstrained problems with quadratic costs. The new method constructs an affine feedback control law, obtained by minimizing a novel quadratic approximation to the optimal cost-to-go function. Global convergence is guaranteed through a Levenberg-Marquardt method; convergence in the vicinity of a local minimum is quadratic. Performance is illustrated on a limited-torque inverted pendulum problem, as well as a complex biomechanical control problem involving a stochastic model of the human arm, with 10 state dimensions and 6 muscle actuators. A Matlab implementation of the new algorithm is available at www.cogsci.ucsd.edu/~todorov.
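The step the abstract describes, an affine feedback law obtained by minimizing a quadratic model, can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation. The function name `affine_control_law`, the symbols `g`, `G`, `H`, `lam`, and the particular regularization and clamping choices are assumptions made for illustration. Given a local quadratic model in the control deviation du and state deviation dx, namely g^T du + dx^T G^T du + 0.5 du^T H du, the minimizer is du = l + L dx. The sketch forces H to be positive definite in a Levenberg-Marquardt-like way and then clamps the open-loop part to box limits.

```python
import numpy as np

def affine_control_law(g, G, H, lam=1e-3, u_nominal=None, u_min=None, u_max=None):
    """Return (l, L) such that du = l + L @ dx minimizes
    g.T @ du + dx.T @ G.T @ du + 0.5 * du.T @ H @ du  (H regularized).

    Hypothetical helper for illustration; shapes: g (m,), G (m, n), H (m, m).
    """
    # Symmetrize H, clip negative eigenvalues to zero, then add lam
    # (assumed Levenberg-Marquardt-style fix so the inverse exists).
    H = 0.5 * (H + H.T)
    eigvals, eigvecs = np.linalg.eigh(H)
    eigvals = np.maximum(eigvals, 0.0) + lam
    H_inv = eigvecs @ np.diag(1.0 / eigvals) @ eigvecs.T

    l = -H_inv @ g   # open-loop correction
    L = -H_inv @ G   # feedback gain on the state deviation

    if u_nominal is not None:
        # Assumed constraint handling: clamp the updated nominal control to
        # the box [u_min, u_max] and switch off feedback on clamped channels.
        u_new = np.clip(u_nominal + l, u_min, u_max)
        clamped = u_new != u_nominal + l
        l = u_new - u_nominal
        L = L.copy()
        L[clamped, :] = 0.0
    return l, L

# Example: one control input, two state dimensions.
g = np.array([2.0])
G = np.array([[1.0, 0.0]])
H = np.array([[2.0]])
l, L = affine_control_law(g, G, H, lam=0.0)
```

In a full iterative scheme this step would be repeated backward in time along a nominal trajectory, with `lam` increased when an iteration fails to reduce the cost and decreased otherwise; that outer loop is omitted here.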

Pages: 300-306

Page count: 7

