A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems

Cited by: 301
Authors
Todorov, E [1]
Li, WW [1]
Affiliations
[1] Univ Calif San Diego, Fac Cognit Sci Dept, La Jolla, CA 92093 USA
Source
ACC: PROCEEDINGS OF THE 2005 AMERICAN CONTROL CONFERENCE, VOLS 1-7 | 2005
Keywords
DOI
10.1109/acc.2005.1469949
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present an iterative Linear-Quadratic-Gaussian method for locally-optimal feedback control of nonlinear stochastic systems subject to control constraints. Previously, similar methods have been restricted to deterministic unconstrained problems with quadratic costs. The new method constructs an affine feedback control law, obtained by minimizing a novel quadratic approximation to the optimal cost-to-go function. Global convergence is guaranteed through a Levenberg-Marquardt method; convergence in the vicinity of a local minimum is quadratic. Performance is illustrated on a limited-torque inverted pendulum problem, as well as a complex biomechanical control problem involving a stochastic model of the human arm, with 10 state dimensions and 6 muscle actuators. A Matlab implementation of the new algorithm is available at www.cogsci.ucsd.edu/~todorov.
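As a rough illustration of the kind of computation the abstract refers to, the sketch below runs a finite-horizon linear-quadratic backward pass with a Levenberg-Marquardt-style damping term added to the control Hessian. It is not the authors' algorithm or their Matlab code: the double-integrator model, the cost weights, the function names `backward_pass` and `rollout`, and the simple clipping of the control to `u_max` are all assumptions chosen for illustration. The paper's method additionally handles nonlinear dynamics, noise, and constraints inside the optimization itself.

```python
import numpy as np

def backward_pass(A, B, Q, R, Qf, horizon, lam=0.0):
    """Finite-horizon LQR backward pass with damping `lam` (hypothetical helper).

    Returns time-ordered feedback gains L_k for the law u_k = L_k @ x_k
    and the quadratic cost-to-go matrix S at the first time step.
    """
    m = B.shape[1]
    S = Qf.copy()
    gains = []
    for _ in range(horizon):
        H = R + B.T @ S @ B              # control Hessian of the cost-to-go
        G = B.T @ S @ A                  # control-state cross term
        H_reg = H + lam * np.eye(m)      # Levenberg-Marquardt-style damping
        L = -np.linalg.solve(H_reg, G)   # damped feedback gain
        # Cost-to-go of the law actually applied, so the true H is used here.
        S = Q + A.T @ S @ A + L.T @ H @ L + L.T @ G + G.T @ L
        S = 0.5 * (S + S.T)              # keep S symmetric numerically
        gains.append(L)
    gains.reverse()                      # computed backward, returned forward
    return gains, S

def rollout(A, B, gains, x0, u_max):
    """Simulate the closed loop, clipping controls to [-u_max, u_max]."""
    x = x0.copy()
    for L in gains:
        u = np.clip(L @ x, -u_max, u_max)  # naive constraint handling (assumption)
        x = A @ x + B @ u
    return x

# Assumed toy problem: discretized double integrator with time step 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, Qf = np.eye(2), 0.1 * np.eye(1), 10.0 * np.eye(2)

gains, S0 = backward_pass(A, B, Q, R, Qf, horizon=100)
gains_damped, _ = backward_pass(A, B, Q, R, Qf, horizon=100, lam=5.0)
x_final = rollout(A, B, gains, np.array([1.0, 0.0]), u_max=5.0)
```

With `lam=0` the recursion reduces to the standard discrete-time Riccati recursion. A larger `lam` shrinks the control update, which is the usual role of Levenberg-Marquardt damping when a full Newton-like step would be too aggressive.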
Pages: 300-306
Page count: 7