Inverse Q-Learning Optimal Control for Takagi-Sugeno Fuzzy Systems

Times cited: 0
Authors
Song, Wenting [1]
Ning, Jun [1]
Tong, Shaocheng [1, 2]
Affiliations
[1] Dalian Maritime Univ, Nav Coll, Dalian 116026, Peoples R China
[2] Liaoning Univ Technol, Coll Sci, Jinzhou 121000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Optimal control; Fuzzy systems; Q-learning; Cost function; Expert systems; Approximation algorithms; Linear systems; Heuristic algorithms; Differential games; Vectors; Fuzzy inverse reinforcement learning optimal control; Q-learning algorithm; Takagi-Sugeno (T-S) fuzzy systems; zero-sum differential game; STABILITY; DESIGN;
DOI
10.1109/TFUZZ.2025.3563361
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Inverse reinforcement learning optimal control operates within a learner-expert framework: the learner system learns the expert system's trajectory and optimal control policy via a reinforcement learning algorithm without needing a predefined cost function, so it can solve optimal control problems effectively. This article develops a fuzzy inverse reinforcement learning optimal control scheme for Takagi-Sugeno (T-S) fuzzy systems with disturbances. Because the controlled fuzzy systems (learner systems) aim to learn or imitate the expert system's behavior trajectories, a learner-expert structure is established in which the learner knows only the expert system's optimal control policy. To reconstruct the expert system's cost function, we develop a model-free inverse Q-learning algorithm that consists of two learning stages: an inner Q-learning iteration loop and an outer inverse optimal iteration loop. The inner loop finds the fuzzy optimal control policy and the worst-case disturbance input for the learner system's cost function by employing zero-sum differential game theory. The outer loop updates the learner system's state-penalty weight by observing only the expert system's optimal control policy. Being model-free, the algorithm does not require knowledge of the controlled system dynamics. It is proved that the designed algorithm converges and that the developed inverse reinforcement learning optimal control policy ensures that the T-S fuzzy learner system attains a Nash equilibrium solution. Finally, we apply the presented fuzzy inverse Q-learning optimal control method to a nonlinear unmanned surface vehicle system, and computer simulation results verify the effectiveness of the developed scheme.
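For orientation, the setting named in the abstract can be written in a standard textbook form. This is a sketch in conventional notation, not the paper's own equations: the symbols $A_i$, $B_i$, $D_i$, $h_i$, $z$, $r$, $Q$, $R$, and $\gamma$ are assumptions introduced here for illustration, as is the infinite-horizon quadratic form of the cost.

```latex
% Assumed notation (not taken from the source record):
% x: state, u: control input, d: disturbance, z: premise variables,
% h_i: normalized membership functions, r: number of fuzzy rules,
% Q: state-penalty weight, R: control weight, gamma: attenuation level.
\begin{align}
  \dot{x}(t) &= \sum_{i=1}^{r} h_i\bigl(z(t)\bigr)
      \bigl[ A_i x(t) + B_i u(t) + D_i d(t) \bigr],
  \qquad h_i \ge 0,\quad \sum_{i=1}^{r} h_i = 1, \\
  J(u, d) &= \int_{0}^{\infty}
      \bigl( x^{\top} Q x + u^{\top} R u - \gamma^{2} d^{\top} d \bigr)\, dt, \\
  J(u^{*}, d) &\le J(u^{*}, d^{*}) \le J(u, d^{*})
  \quad \text{for all admissible } u,\ d .
\end{align}
```

Under this reading, the inner loop would search for the saddle point $(u^{*}, d^{*})$ with $Q$ held fixed, and the outer loop would adjust the state-penalty weight $Q$ until the learner's policy matches the observed expert policy.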
Pages: 2308-2320
Page count: 13