Improvement of LMI controllers of Takagi-Sugeno models via Q-learning

Cited by: 3

Authors:

Diaz, Henry [1]

Armesto, Leopoldo [1]

Sala, Antonio [1]

Affiliations:

[1] Univ Politecn Valencia, IDF, Inst Univ Autom Inf Ind AI2, C Camino Vera S-N, E-46022 Valencia, Spain

Source:

IFAC PAPERSONLINE | 2016 / Vol. 49 / Issue 05

Keywords:

Reinforcement learning; adaptive dynamic programming; Q-learning; Takagi-Sugeno; LMI; fuzzy control; systems

DOI:

10.1016/j.ifacol.2016.07.091

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper presents a preliminary attempt to bridge the gap between the conservative (shape-independent) results from guaranteed-cost LMIs and reinforcement learning setups that learn optimal controllers from data. To this end, the proposed approach is initialized with the LMI solution and approximates the Q-function using polynomials of the membership functions in Takagi-Sugeno models. The resulting controller is shape-dependent, that is, it uses knowledge of the membership functions together with data to clearly improve on the LMI solutions. © 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
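As a rough illustration of what a shape-dependent Q-function can look like, the sketch below blends one quadratic kernel per fuzzy rule using the membership functions (a degree-1 polynomial in the memberships) and computes the control that minimizes it. This is not the paper's algorithm: the two-rule sigmoid memberships, the kernels `H1` and `H2`, and all function names are hypothetical values chosen only for illustration. Under the same assumptions, an LMI-based initialization could start with all kernels equal to a single shape-independent matrix.

```python
import numpy as np

# Hypothetical symmetric kernels for a system with 2 states and 1 input.
H1 = np.array([[2.0, 0.0, 0.5],
               [0.0, 1.0, 0.2],
               [0.5, 0.2, 1.0]])
H2 = np.array([[3.0, 0.1, -0.4],
               [0.1, 2.0, 0.3],
               [-0.4, 0.3, 2.0]])
H_LIST = [H1, H2]

def memberships(x):
    """Assumed two-rule membership functions; they are nonnegative and sum to 1."""
    h1 = 1.0 / (1.0 + np.exp(-x[0]))  # sigmoid in the first state (assumption)
    return np.array([h1, 1.0 - h1])

def blended_kernel(x, H_list):
    """Degree-1 polynomial in the memberships: H(x) = sum_i h_i(x) * H_i."""
    h = memberships(x)
    return sum(hi * Hi for hi, Hi in zip(h, H_list))

def q_value(x, u, H_list):
    """Approximate Q(x, u) = z^T H(x) z with z = [x; u]."""
    z = np.concatenate([x, u])
    return float(z @ blended_kernel(x, H_list) @ z)

def greedy_control(x, H_list):
    """Minimizer of Q over u for symmetric H: u = -H_uu^{-1} H_ux x."""
    n = x.size
    H = blended_kernel(x, H_list)
    return -np.linalg.solve(H[n:, n:], H[n:, :n] @ x)

x = np.array([0.5, -1.0])
u_star = greedy_control(x, H_LIST)
```

Because the blended kernel depends on the state through the memberships, the resulting feedback gain varies with the state, which is the sense in which such a controller is shape-dependent.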


Pages: 67-72

Page count: 6
