Incremental Approximate Dynamic Programming for Nonlinear Adaptive Tracking Control with Partial Observability

Cited by: 32

Authors:

Zhou, Ye [1, 2]

van Kampen, Erik-Jan [1]

Chu, QiPing [1]

Affiliations:

[1] Delft Univ Technol, Control & Operat Div, Aerosp Engn, NL-2629 HS Delft, South Holland, Netherlands

[2] Univ Sains Malaysia, Sch Aerosp Engn, George Town, Malaysia

Source:

JOURNAL OF GUIDANCE CONTROL AND DYNAMICS | 2018 / Vol. 41 / No. 12

Keywords:

MODEL-PREDICTIVE CONTROL; OUTPUT-FEEDBACK CONTROL; FLIGHT CONTROL; SPACECRAFT; INVERSION; STATE;

DOI:

10.2514/1.G003472

Chinese Library Classification (CLC) number:

V [Aviation, Aerospace];

Discipline classification code:

08 ; 0825 ;

Abstract:

Approximate dynamic programming is a class of reinforcement learning, which solves adaptive, optimal control problems and tackles the curse of dimensionality with function approximators. Within this category, linear approximate dynamic programming provides a model-free control method by systematically using a quadratic cost-to-go function. Although efficient, linear approximate dynamic programming methods are difficult to apply to nonlinear systems or time-varying systems. To overcome the above limitations, this paper proposes an adaptive nonlinear tracking control method based on incremental approximate dynamic programming, which combines the advantages of linear approximate dynamic programming and incremental nonlinear control techniques. This is a model-free method for unknown, nonlinear systems and time-varying references. The trait of separating the local model information from the cost function approximation makes this method an option for partially observable control problems. This paper, therefore, proposes two reference tracking controllers for different observability conditions: the direct measurement of the full state, and the partially observable tracking error. In each condition, two algorithms are developed for off-line learning and online learning, respectively. These algorithms are applied to attitude control of a spacecraft disturbed by internal liquid sloshing. The results demonstrate that the proposed algorithms accurately deal with the unknown, time-varying internal dynamics while retaining efficient control, even with only partial observability.

Pages: 2554-2567

Page count: 14

54 references in total

[1] Convex programming approach to powered descent guidance for Mars landing [J].