Reinforcement Learning with Symbolic Input-Output Models

Authors
Derner, Erik [1,2]
Kubalik, Jiri [1]
Babuska, Robert [1,3]
Affiliations
[1] Czech Tech Univ, Czech Inst Informat Robot & Cybernet, Prague, Czech Republic
[2] Czech Tech Univ, Fac Elect Engn, Dept Control Engn, Prague, Czech Republic
[3] Delft Univ Technol, Cognit Robot, Fac 3mE, Delft, Netherlands
Source
2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2018
Keywords
Model learning; symbolic regression; reinforcement learning; optimal control
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It is well known that reinforcement learning (RL) can benefit from the use of a dynamic prediction model which is learned on data samples collected online from the process to be controlled. Most RL algorithms are formulated in the state-space domain and use state-space models. However, learning state-space models is difficult, mainly because in the vast majority of problems the full state cannot be measured on the system or reconstructed from the measurements. To circumvent this limitation, we propose to use input-output models of the NARX (nonlinear autoregressive with exogenous input) type. Symbolic regression is employed to construct parsimonious models and the corresponding value functions. Thanks to this approach, we can learn accurate models and compute optimal policies even from small amounts of training data. We demonstrate the approach on two simulated examples, a hopping robot and a 1-DOF robot arm, and on a real inverted pendulum system. Results show that our proposed method can reliably determine a good control policy based on a symbolic input-output process model and value function.
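The abstract names the NARX model class without stating its form. As a minimal sketch, assuming the standard NARX form y(k+1) = f(y(k), ..., y(k-ny+1), u(k), ..., u(k-nu+1)), the Python below rolls such a model forward from measured outputs and applied inputs only, with no state vector. The names `simulate_narx` and `toy_model` are hypothetical, and the toy coefficients are illustrative values, not a model from the paper; in the paper's setting, f would be an analytic expression found by symbolic regression.

```python
from collections import deque


def simulate_narx(f, y_init, u_init, u_seq):
    """Roll out y(k+1) = f(y(k..k-ny+1), u(k..k-nu+1)).

    y_init and u_init hold past outputs and inputs, newest first.
    Their lengths set the lags ny and nu (assumed to be at least 1).
    """
    y_hist = deque(y_init, maxlen=len(y_init))
    u_hist = deque(u_init, maxlen=len(u_init))
    outputs = []
    for u in u_seq:
        # appendleft on a full bounded deque drops the oldest entry.
        u_hist.appendleft(u)
        y_next = f(tuple(y_hist), tuple(u_hist))
        y_hist.appendleft(y_next)
        outputs.append(y_next)
    return outputs


def toy_model(y, u):
    # Hypothetical stand-in for a symbolic-regression result (ny=2, nu=1).
    return 0.5 * y[0] + 0.25 * y[1] + u[0]


if __name__ == "__main__":
    # Response to a unit input pulse from rest.
    print(simulate_narx(toy_model, [0.0, 0.0], [0.0], [1.0, 0.0, 0.0]))
```

Because the regressor is built from past outputs and inputs alone, the same history vector can serve as the argument of a value function, which is how an input-output model can replace a state-space model in an RL scheme.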
Pages: 3004-3009
Page count: 6