Data-Driven Tracking Control for Nonaffine Yaw Channel of Helicopter via Off-Policy Reinforcement Learning

Cited by: 11

Authors:

Zhang, Kun ^{[1]}

Luo, Shijie ^{[1]}

Wu, Huai-Ning ^{[2,3]}

Su, Rong ^{[3]}

Affiliations:

[1] Beihang Univ, Sch Astronaut, Beijing 100191, Peoples R China

[2] Beihang Univ, Sch Automat Sci & Elect Engn, Sci & Technol Aircraft Control Lab, Beijing 100191, Peoples R China

[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 311115, Singapore

Source:

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS | 2025 / Vol. 61 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Heuristic algorithms; Aerodynamics; Rotors; Vehicle dynamics; Nonlinear dynamical systems; Tail; Stability criteria; Reinforcement learning; Mathematical models; Games; Adaptive dynamic programming (ADP); nonaffine systems; reinforcement learning (RL); tracking control; uncrewed aerial vehicle (UAV) helicopter; CONTINUOUS-TIME SYSTEMS;

DOI:

10.1109/TAES.2025.3539264

Chinese Library Classification:

V [Aviation, Aerospace];

Discipline classification code:

08 ; 0825 ;

Abstract:

This article presents an off-policy tracking control scheme for the continuous-time nonaffine yaw channel of an uncrewed aerial vehicle helicopter. First, the article constructs an affine augmented system (AAS) within a parallel control structure to convert the original nonaffine tracking error dynamics into affine dynamics. Second, the article derives a stability criterion linking the nonaffine system and the AAS, demonstrating that the zero-sum policy obtained from the AAS achieves the $H_\infty$ performance of the nonaffine system. Third, a data-driven off-policy tracking algorithm is designed to approximate the zero-sum solution of the Hamilton-Jacobi-Isaacs equations with unknown dynamics. Moreover, a recursive least squares process with a variable forgetting factor is employed to update the actor-critic neural network weights, and the convergence of the algorithm is proven. The uniform ultimate boundedness of the tracking errors is then guaranteed. Finally, two application examples are simulated to validate the effectiveness of the presented method.
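The recursive least squares update with a variable forgetting factor mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the article's algorithm: the function name `rls_vff`, the exponential forgetting rule, and the parameters `lam_min`, `rho`, and `p0` are assumptions chosen for illustration, and the regressors stand in for whatever features the actor-critic weights would multiply.

```python
import numpy as np

def rls_vff(phis, ys, lam_min=0.9, rho=1.0, p0=1e3):
    """Recursive least squares with a variable forgetting factor (illustrative)."""
    n = phis.shape[1]
    w = np.zeros(n)          # weight estimate
    P = p0 * np.eye(n)       # covariance-like matrix, large initial uncertainty
    for phi, y in zip(phis, ys):
        err = y - phi @ w    # a priori prediction error
        # Assumed rule (not from the article): a large error pushes the
        # factor toward lam_min (forget faster); a small error pushes it
        # toward 1 (retain memory).
        lam = lam_min + (1.0 - lam_min) * np.exp(-rho * err**2)
        Pphi = P @ phi
        gain = Pphi / (lam + phi @ Pphi)
        w = w + gain * err
        P = (P - np.outer(gain, Pphi)) / lam
    return w

# Hypothetical noise-free regression problem used only to exercise the update.
rng = np.random.default_rng(0)
w_true = np.array([1.5, -0.7, 0.3])
phis = rng.normal(size=(200, 3))
ys = phis @ w_true
w_hat = rls_vff(phis, ys)
```

In this sketch the forgetting factor stays within `(lam_min, 1]`, so the division by `lam` is always well defined; the article's actual rule for varying the factor may differ.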


Pages: 7725-7737

Page count: 13
