Reinforcement Learning-Based Control of Nonlinear Systems Using Lyapunov Stability Concept and Fuzzy Reward Scheme

Cited by: 20
Authors
Chen, Ming [1 ]
Lam, Hak Keung [1 ]
Shi, Qian [1 ]
Xiao, Bo [2 ,3 ]
Affiliations
[1] King's College London, Department of Engineering, London WC2R 2LS, England
[2] Imperial College London, Hamlyn Centre for Robotic Surgery, London SW7 2AZ, England
[3] Imperial College London, Department of Computing, London SW7 2AZ, England
Keywords
Optimization; Fuzzy logic; Lyapunov methods; Circuit stability; Stability analysis; Aerospace electronics; Circuits and systems; Proximal policy optimization (PPO); Adjustable policy learning rate (APLR); Lyapunov reward system; Fuzzy reward system; Cart-pole inverted pendulum; Design
DOI
10.1109/TCSII.2019.2947682
CLC Classification Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject Classification Code
0808; 0809
Abstract
In this brief, a reinforcement learning-based control approach for nonlinear systems is presented. The approach introduces an adjustable policy learning rate (APLR) scheme that reduces the influence of negative or large advantage estimates, which improves the learning stability of the proximal policy optimization (PPO) algorithm. In addition, this brief puts forward a Lyapunov-fuzzy reward system to further improve learning efficiency: the Lyapunov stability concept is incorporated into the design of the Lyapunov reward system, and a dedicated fuzzy reward system is constructed from knowledge of the cart-pole inverted pendulum using a fuzzy inference system (FIS). The merits of the proposed approach are validated by simulation examples.
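The abstract names three ingredients (an adjustable policy learning rate for PPO, a Lyapunov-based reward, and an FIS-style fuzzy reward for the cart-pole) but does not give their formulas. The Python sketch below is a minimal illustration of these ideas under stated assumptions: the quadratic Lyapunov candidate V(s) = s^T P s, the advantage-dependent learning-rate scaling, the triangular membership function, and all function names are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the abstract does not give the exact APLR rule or
# the Lyapunov/fuzzy reward definitions, so the forms below are assumptions.
import numpy as np


def lyapunov_reward(state, next_state, P=None):
    """Reward the decrease of a quadratic Lyapunov candidate V(s) = s^T P s.

    `state` = (cart position, cart velocity, pole angle, pole angular velocity).
    P is a positive-definite weight matrix (identity by default, an assumption).
    """
    s, s_next = np.asarray(state, dtype=float), np.asarray(next_state, dtype=float)
    if P is None:
        P = np.eye(len(s))
    v_now = s @ P @ s
    v_next = s_next @ P @ s_next
    return v_now - v_next  # positive when the state moves toward the origin


def fuzzy_reward(pole_angle, angle_limit=0.21):
    """Hypothetical fuzzy-style reward: triangular membership of 'pole upright'.

    Returns a value in [0, 1] that degrades linearly as the pole angle approaches
    the failure limit; a real FIS would combine several membership functions
    through fuzzy rules and defuzzification.
    """
    return max(0.0, 1.0 - abs(pole_angle) / angle_limit)


def adjusted_policy_lr(base_lr, advantages, k=0.5):
    """Hypothetical adjustable policy learning rate (APLR).

    Shrinks the PPO policy learning rate when advantage estimates are large in
    magnitude or predominantly negative, so a single noisy batch cannot
    destabilize the policy update.
    """
    adv = np.asarray(advantages, dtype=float)
    scale = 1.0 + k * (np.mean(np.abs(adv)) + max(0.0, -np.mean(adv)))
    return base_lr / scale


if __name__ == "__main__":
    r_lyap = lyapunov_reward([0.10, 0.00, 0.05, 0.00], [0.08, -0.01, 0.03, -0.02])
    r_fuzzy = fuzzy_reward(0.05)
    lr = adjusted_policy_lr(3e-4, advantages=[-1.2, 0.4, -0.8, 2.5])
    print(f"Lyapunov reward: {r_lyap:.4f}, fuzzy reward: {r_fuzzy:.3f}, lr: {lr:.6f}")
```

In this reading, the Lyapunov term rewards any transition that decreases V, the fuzzy term grades how close the pole stays to upright, and the APLR rule simply damps the policy step whenever the advantage batch looks noisy; the authors' actual constructions should be taken from the full brief.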
Pages: 2059-2063
Number of pages: 5