Reinforcement Learning with Symbolic Input-Output Models

Cited by: 0

Authors:

Derner, Erik [1,2]

Kubalik, Jiri [1]

Babuska, Robert [1,3]

Affiliations:

[1] Czech Tech Univ, Czech Inst Informat Robot & Cybernet, Prague, Czech Republic

[2] Czech Tech Univ, Fac Elect Engn, Dept Control Engn, Prague, Czech Republic

[3] Delft Univ Technol, Cognit Robot, Fac 3mE, Delft, Netherlands

Source:

2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2018

Keywords:

Model learning; symbolic regression; reinforcement learning; optimal control

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

It is well known that reinforcement learning (RL) can benefit from the use of a dynamic prediction model which is learned on data samples collected online from the process to be controlled. Most RL algorithms are formulated in the state-space domain and use state-space models. However, learning state-space models is difficult, mainly because in the vast majority of problems the full state cannot be measured on the system or reconstructed from the measurements. To circumvent this limitation, we propose to use input-output models of the NARX (nonlinear autoregressive with exogenous input) type. Symbolic regression is employed to construct parsimonious models and the corresponding value functions. Thanks to this approach, we can learn accurate models and compute optimal policies even from small amounts of training data. We demonstrate the approach on two simulated examples, a hopping robot and a 1-DOF robot arm, and on a real inverted pendulum system. Results show that our proposed method can reliably determine a good control policy based on a symbolic input-output process model and value function.
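To make the NARX form mentioned in the abstract concrete, the following is a minimal sketch of an input-output model that predicts the next output from lagged outputs and inputs only, without any measured state. The model function `toy_model`, its coefficients, and the helper names `narx_step` and `simulate` are illustrative assumptions and do not come from the paper; in the paper, the model expression is found by symbolic regression rather than written by hand.

```python
def narx_step(f, y_hist, u_hist):
    """One-step prediction y(k+1) = f(y(k), ..., y(k-ny+1), u(k), ..., u(k-nu+1))."""
    return f(*y_hist, *u_hist)


def simulate(f, y_init, u_seq, ny, nu):
    """Roll a NARX model forward using only past outputs and inputs."""
    if len(y_init) < max(ny, nu):
        raise ValueError("need at least max(ny, nu) initial outputs")
    y = list(y_init)
    # The first prediction is made from the last known output, y[len(y_init) - 1].
    for k in range(len(y_init) - 1, len(u_seq)):
        y_hist = [y[k - i] for i in range(ny)]      # y(k), y(k-1), ...
        u_hist = [u_seq[k - j] for j in range(nu)]  # u(k), u(k-1), ...
        y.append(narx_step(f, y_hist, u_hist))
    return y


def toy_model(y_k, y_km1, u_k):
    # Hypothetical analytic expression standing in for a model that
    # symbolic regression might return; the coefficients are made up.
    return 0.5 * y_k - 0.25 * y_km1 + u_k * (1.0 - y_km1)


trajectory = simulate(toy_model, y_init=[0.0, 0.0],
                      u_seq=[0.0, 1.0, 1.0, 0.0], ny=2, nu=1)
print(trajectory)
```

Here `ny` and `nu` are the numbers of output and input lags. A value function in this setting would take the same lagged vector as its argument in place of a state.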

Pages: 3004-3009

Page count: 6

Related records: 50 in total

[1] Sigma*: Symbolic Learning of Input-Output Specifications
Botincan, Matko
Babic, Domagoj
ACM SIGPLAN NOTICES, 2013, 48 (01) : 443 - 455
[2] REINFORCEMENT LEARNING OR TRACKING OF INPUT-OUTPUT MAPS
HEISS, M
APPLIED ARTIFICIAL INTELLIGENCE, 1994, 8 (04) : 483 - 496
[3] Input-Output Manifold Learning with State Space Models
Tanaka, Daisuke
Matsubara, Takamitsu
Sugimoto, Kenji
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (06) : 1179 - 1187
[4] DISAGGREGATING INPUT-OUTPUT MODELS
WOLSKY, AM
REVIEW OF ECONOMICS AND STATISTICS, 1984, 66 (02) : 283 - 291
[5] DYNAMIC INPUT-OUTPUT MODELS
WOLPERT, SA
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1967, 62 (318) : 745 - &
[6] INPUT-OUTPUT MODELS AND VISICALC
CUSTER, SW
AKRON BUSINESS AND ECONOMIC REVIEW, 1986, 17 (02) : 16 - 21
[7] The choice of an input-output table embedded in regional econometric input-output models
Israilevich, PR
Hewings, GJD
Schindler, GR
Mahidhara, R
PAPERS IN REGIONAL SCIENCE, 1996, 75 (02) : 103 - 119
[8] A family of test selection criteria for Timed Input-Output Symbolic Transition System models
Moraes, Alan
Andrade, Wilkerson L.
Machado, Patricia D. L.
SCIENCE OF COMPUTER PROGRAMMING, 2016, 126 : 52 - 72
[9] Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning
Castaneda, Fernando
Wulfman, Mathias
Agrawal, Ayush
Westenbroek, Tyler
Tomlin, Claire J.
Sastry, S. Shankar
Sreenath, Koushil
LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 990 - 999
[10] BASIC EQUATION OF INPUT-OUTPUT MODELS
RIBARIC, M
ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND MECHANIK, 1976, 56 (03) : T263 - T264