An Efficient Impulsive Adaptive Dynamic Programming Algorithm for Stochastic Systems

Cited by: 11
Authors
Liang, Mingming [1]
Wang, Yonghua [1]
Liu, Derong [1, 2]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China
Keywords
Heuristic algorithms; Stochastic systems; Approximation algorithms; Aerospace electronics; Markov processes; Dynamic programming; Probability distribution; Adaptive dynamic programming (ADP); impulsive stochastic systems; optimal control; policy iteration; transition matrix; rollout
DOI
10.1109/TCYB.2022.3158898
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this study, a novel general impulsive transition matrix is defined, which reveals the transition dynamics and the evolution of the probability distribution for all system states between two impulsive "events" rather than between two regular time indexes. Based on this general matrix, a policy iteration-based impulsive adaptive dynamic programming (IADP) algorithm and a more efficient variant, the EIADP algorithm, are developed to solve optimal impulsive control problems for discrete stochastic systems. By analyzing the monotonicity, stability, and convergence properties of the iterative value functions and control laws, it is proved that both the IADP and EIADP algorithms converge to the optimal impulsive performance index function. The EIADP algorithm divides the whole impulsive policy into smaller pieces and updates the iterative policies in a "piece-by-piece" manner according to the actual hardware constraints. This feature allows these ADP-based algorithms to be optimized to run on computing devices of all "sizes", including those with limited memory. A simulation experiment is conducted to validate the effectiveness of the proposed methods.
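The "piece-by-piece" policy update described in the abstract can be illustrated with a generic blockwise policy iteration on a small finite Markov decision process. This is only a sketch under stated assumptions, not the paper's IADP or EIADP algorithm: the three-state model, the two actions (0 = no impulse, 1 = apply impulse), the costs, the discount factor, and all function names are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical toy model (NOT from the paper): 3 states, 2 actions.
# Action 0 = no impulse: the state drifts away from state 0.
# Action 1 = impulse: the state jumps back to state 0 at an extra cost of 2.
P = [
    [[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]],  # P[0][s][t]
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # P[1][s][t]
]
cost = [[0.0, 1.0, 4.0], [2.0, 3.0, 6.0]]  # cost[a][s]
GAMMA = 0.9  # assumed discount factor


def evaluate(policy: List[int], P, cost, gamma: float, sweeps: int = 500) -> List[float]:
    """Approximate the value of a fixed policy by repeated Bellman backups."""
    n = len(policy)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [
            cost[policy[s]][s]
            + gamma * sum(P[policy[s]][s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
    return v


def improve_piece(policy: List[int], v: List[float], P, cost, gamma: float,
                  piece: List[int]) -> List[int]:
    """Greedy (cost-minimizing) improvement restricted to the states in `piece`."""
    n = len(policy)
    new_policy = list(policy)
    for s in piece:
        q = [
            cost[a][s] + gamma * sum(P[a][s][t] * v[t] for t in range(n))
            for a in range(len(P))
        ]
        best = policy[s]
        for a in range(len(P)):
            if q[a] < q[best] - 1e-12:  # switch only on a strict improvement
                best = a
        new_policy[s] = best
    return new_policy


def piecewise_policy_iteration(P, cost, gamma: float, piece_size: int,
                               max_rounds: int = 100) -> Tuple[List[int], List[float]]:
    """Policy iteration that improves the policy one block of states at a time."""
    n = len(P[0])
    policy = [0] * n  # start from "never apply an impulse"
    pieces = [list(range(i, min(i + piece_size, n))) for i in range(0, n, piece_size)]
    for _ in range(max_rounds):
        changed = False
        for piece in pieces:
            v = evaluate(policy, P, cost, gamma)  # re-evaluated before every piece
            new_policy = improve_piece(policy, v, P, cost, gamma, piece)
            changed = changed or new_policy != policy
            policy = new_policy
        if not changed:  # a full round without changes: policy is greedy everywhere
            break
    return policy, evaluate(policy, P, cost, gamma)


policy, values = piecewise_policy_iteration(P, cost, GAMMA, piece_size=1)
print(policy, [round(x, 4) for x in values])
```

In this sketch, `piece_size` controls how many states are improved per step. The sketch re-evaluates the whole policy before each piece for simplicity, so it does not by itself reduce memory use; it only shows that improving a policy in blocks can reach the same result as improving all states at once on this toy model.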
Pages: 5545-5559
Number of pages: 15
Related papers
50 records in total
  • [1] Pareto optimal control of the mean-field stochastic systems by adaptive dynamic programming algorithm
    Ge, Yingying
    Liu, Xikui
    Li, Yan
    ISA TRANSACTIONS, 2020, 102 (102) : 81 - 90
  • [2] An adaptive dynamic programming algorithm for a stochastic multiproduct batch dispatch problem
    Papadaki, KP
    Powell, WB
    NAVAL RESEARCH LOGISTICS, 2003, 50 (07) : 742 - 769
  • [3] Liquid-Updating Impulsive Adaptive Dynamic Programming for Continuous Nonlinear Systems
    Liang, Mingming
    Liu, Derong
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (02): : 716 - 728
  • [4] Designing a Stochastic Adaptive Impulsive Observer for Stochastic Linear and Nonlinear Impulsive Systems
    Ayati, Moosa
    Alwan, Mohamad
    Liu, Xinzhi
    Khaloozadeh, Hamid
    ADVANCES IN MATHEMATICAL AND COMPUTATIONAL METHODS: ADDRESSING MODERN CHALLENGES OF SCIENCE, TECHNOLOGY, AND SOCIETY, 2011, 1368
  • [5] An asymptotically efficient algorithm for finite horizon stochastic dynamic programming problems
    Chang, HS
    Fu, MC
    Marcus, SI
    42ND IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-6, PROCEEDINGS, 2003, : 3818 - 3823
  • [6] Adaptive Dynamic Programming for Stochastic Systems With State and Control Dependent Noise
    Bian, Tao
    Jiang, Yu
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 4170 - 4175
  • [7] Discrete-Time Impulsive Adaptive Dynamic Programming
    Wei, Qinglai
    Song, Ruizhuo
    Liao, Zehua
    Li, Benkai
    Lewis, Frank L.
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (10) : 4293 - 4306
  • [8] Efficient Parallelization of the Stochastic Dual Dynamic Programming Algorithm Applied to Hydropower Scheduling
    Helseth, Arild
    Braaten, Hallvard
    ENERGIES, 2015, 8 (12): : 14287 - 14297
  • [9] SMALL STOCHASTIC IMPULSIVE PERTURBATIONS OF DYNAMIC SYSTEMS
    GRIN, AG
    TEORIYA VEROYATNOSTEI I YEYE PRIMENIYA, 1975, 20 (01): : 150 - 158
  • [10] Robust adaptive dynamic programming for continuous-time linear stochastic systems
    Bian, Tao
    Jiang, Zhong-Ping
    2014 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT CONTROL (ISIC), 2014, : 536 - 541