Adaptive dynamic programming

Cited by: 496
Authors
Murray, JJ [1]
Cox, CJ
Lendaris, GG
Saeks, R
Affiliations
[1] SUNY Stony Brook, Dept Elect Engn, Stony Brook, NY 11790 USA
[2] Accurate Automat Corp, Chattanooga, TN 37421 USA
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS | 2002, Vol. 32, No. 2
Funding
National Science Foundation (USA); National Aeronautics and Space Administration (NASA);
Keywords
adaptive control; adaptive critic; dynamic programming; nonlinear control; optimal control;
DOI
10.1109/TSMCC.2002.801727
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unlike the many soft computing applications where it suffices to achieve a "good approximation most of the time," a control system must be stable all of the time. As such, if one desires to learn a control law in real time, a fusion of soft computing techniques to learn the appropriate control law with hard computing techniques to maintain the stability constraint and guarantee convergence is required. The objective of the present paper is to describe an adaptive dynamic programming algorithm (ADPA) which fuses soft computing techniques to learn the optimal cost (or return) functional for a stabilizable nonlinear system with unknown dynamics and hard computing techniques to verify the stability and convergence of the algorithm. Specifically, the algorithm is initialized with a (stabilizing) cost functional and the system is run with the corresponding control law (defined by the Hamilton-Jacobi-Bellman equation), with the resultant state trajectories used to update the cost functional in a soft computing mode. Hard computing techniques are then used to show that this process is globally convergent with stepwise stability to the optimal cost functional/control law pair for an (unknown) input affine system with an input quadratic performance measure (modulo the appropriate technical conditions). Three specific implementations of the ADPA are developed for 1) the linear case, 2) the nonlinear case using a locally quadratic approximation to the cost functional, and 3) the nonlinear case using a radial basis function approximation of the cost functional; these are illustrated by applications to flight control.
Pages: 140 - 153
Page count: 14
Related papers
50 in total
  • [1] The adaptive dynamic programming theorem
    Murray, JJ
    Cox, CJ
    Saeks, RE
    STABILITY AND CONTROL OF DYNAMICAL SYSTEMS WITH APPLICATIONS: A TRIBUTE TO ANTHONY N. MICHEL, 2003, : 379 - 394
  • [2] The Adaptive Dynamic Programming Toolbox
    Xing, Xiaowei
    Chang, Dong Eui
    SENSORS, 2021, 21 (16)
  • [3] Adaptive Dynamic Programming: An Introduction
    Wang, Fei-Yue
    Zhang, Huaguang
    Liu, Derong
    IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2009, 4 (02) : 39 - 47
  • [4] A Retrospective on Adaptive Dynamic Programming for Control
    Lendaris, George G.
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2009, : 945 - 952
  • [5] Clipping in Neurocontrol by Adaptive Dynamic Programming
    Fairbank, Michael
    Prokhorov, Danil
    Alonso, Eduardo
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (10) : 1909 - 1920
  • [6] A dynamic programming approach to adaptive fractionation
    Ramakrishnan, Jagdish
    Craft, David
    Bortfeld, Thomas
    Tsitsiklis, John N.
    PHYSICS IN MEDICINE AND BIOLOGY, 2012, 57 (05) : 1203 - 1216
  • [7] Adaptive Dynamic Programming for Feedback Control
    Lewis, Frank L.
    Vrabie, Draguna
    ASCC: 2009 7TH ASIAN CONTROL CONFERENCE, VOLS 1-3, 2009, : 1402 - 1409
  • [8] Integration of Fuzzy Controller with Adaptive Dynamic Programming
    Zhu, Yuanheng
    Zhao, Dongbin
    He, Haibo
    PROCEEDINGS OF THE 10TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA 2012), 2012, : 310 - 315
  • [9] Dynamic control of adaptive parameters in evolutionary programming
    Liang, KH
    Yao, X
    Newton, C
    SIMULATED EVOLUTION AND LEARNING, 1999, 1585 : 42 - 49
  • [10] Adaptive dynamic programming as a theory of sensorimotor control
    Yu Jiang
    Zhong-Ping Jiang
    Biological Cybernetics, 2014, 108 : 459 - 473