H∞ tracking control for perturbed discrete-time systems using On/Off policy Q-learning algorithms
Cited by: 1
Affiliation:
[1] Hanoi Univ Sci & Technol, Sch Elect & Elect Engn, Hanoi, Vietnam
Keywords:
Perturbed discrete-time systems;
Q-learning;
On/off policy algorithm;
Model-free control;
Reinforcement learning control;
ADAPTIVE OPTIMAL-CONTROL;
DOI:
10.1016/j.chaos.2025.116459
Chinese Library Classification (CLC) number:
O1 [Mathematics];
Discipline classification code:
0701 ;
070101 ;
Abstract:
The widely studied H∞ zero-sum game formulation incorporates external disturbances into the optimal control problem. This article proposes two model-free Q-learning algorithms based on H∞ tracking control for perturbed discrete-time systems subject to external disturbances. The output optimal control problem is also modified. For the optimal tracking control problem, a discount factor is required to keep the cost function finite, and the Riccati equation is modified accordingly. Using the difference between the Q-functions at two consecutive time steps, the basic principles of on-policy and off-policy learning, and the H∞ zero-sum game formulation, on-policy and off-policy Q-learning algorithms for H∞ tracking control are derived. The influence of probing noise on the computed Q-function is then analyzed. A solution-equivalence analysis proves that the proposed algorithms guarantee convergence and tracking. Finally, simulation studies on an F-16 aircraft assess the validity of the presented control schemes.
Pages: 20
Related papers
46 in total
[1] Bian, Tao; Jiang, Zhong-Ping. Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design [J]. AUTOMATICA, 2016, 71: 348-360

[2] Cao, Weiwei; Yang, Qinmin; Meng, Wenchao; Xie, Shuzong. Data-Based Robust Adaptive Dynamic Programming for Balancing Control Performance and Energy Consumption in Wastewater Treatment Process [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (04): 6622-6630

[3] Chen, Ci; Lewis, Frank L.; Xie, Kan; Xie, Shengli. Adaptive Optimal Control of Unknown Nonlinear Systems via Homotopy-Based Policy Iteration [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (05): 3396-3403

[4] Chen, Zhe; Xue, Wenqian; Li, Ning; Lian, Bosen; Lewis, Frank L. A novel Z-function-based completely model-free reinforcement learning method to finite-horizon zero-sum game of nonlinear system [J]. NONLINEAR DYNAMICS, 2022, 107 (03): 2563-2582

[5] Ding, Zhongjun; Wang, Haipeng; Sun, Yanchao; Qin, Hongde. Adaptive prescribed performance second-order sliding mode tracking control of autonomous underwater vehicle using neural network-based disturbance observer [J]. OCEAN ENGINEERING, 2022, 260

[6] Guo, Linyuan; Rizvi, Syed Ali Asad; Lin, Zongli. Optimal control of a two-wheeled self-balancing robot by reinforcement learning [J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31 (06): 1885-1904

[7] Guo, Wenxin; Qin, Weiwei; Lan, Xuguang; Liu, Jieyu; Zhang, Zhaoxiang. Robust control for affine nonlinear systems under the reinforcement learning framework [J]. NEUROCOMPUTING, 2024, 587

[8] HEWER, GA. ITERATIVE TECHNIQUE FOR COMPUTATION OF STEADY STATE GAINS FOR DISCRETE OPTIMAL REGULATOR [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1971, AC16 (04): 382-+

[9] Ijaz, Salman; Chen Fuyang; Hamayun, Mirza Tariq; Anwaar, Haris. Adaptive integral-sliding-mode control strategy for maneuvering control of F16 aircraft subject to aerodynamic uncertainty [J]. APPLIED MATHEMATICS AND COMPUTATION, 2021, 402

[10] Jiang, Huaiyuan; Zhou, Bin. Bias-policy iteration based adaptive dynamic programming for unknown continuous-time linear systems [J]. AUTOMATICA, 2022, 136
