Multi-timescale nexting in a reinforcement learning robot

Cited by: 37
Authors
Modayil, Joseph [1]
White, Adam [1]
Sutton, Richard S. [1]
Affiliations
[1] Univ Alberta, Reinforcement Learning & Artificial Intelligence, Edmonton, AB, Canada
Keywords
Reinforcement learning; robotics; predictive knowledge; temporal difference learning; FUTURE;
DOI
10.1177/1059712313511648
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The term 'nexting' has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense. The ability to 'next' constitutes a basic kind of awareness and knowledge of one's environment. In this paper we present results with a robot that learns to next in real time, making thousands of predictions about sensory input signals at timescales from 0.1 to 8 seconds. Our predictions are formulated as a generalization of the value functions commonly used in reinforcement learning, where now an arbitrary function of the sensory input signals is used as a pseudo reward, and the discount rate determines the timescale. We show that six thousand predictions, each computed as a function of six thousand features of the state, can be learned and updated online ten times per second on a laptop computer, using the standard temporal-difference(λ) algorithm with linear function approximation. This approach is sufficiently computationally efficient to be used for real-time learning on the robot and sufficiently data efficient to achieve substantial accuracy within 30 minutes. Moreover, a single tile-coded feature representation suffices to accurately predict many different signals over a significant range of timescales. We also extend nexting beyond simple timescales by letting the discount rate be a function of the state and show that nexting predictions of this more general form can also be learned with substantial accuracy. General nexting provides a simple yet powerful mechanism for a robot to acquire predictive knowledge of the dynamics of its environment.
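The formulation in the abstract (a pseudo reward predicted at several timescales, each set by a discount rate, learned with linear temporal-difference(λ)) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `LinearTDLambda`, the helper `timescale_to_gamma`, the step size, the trace parameter, and the use of accumulating traces are all assumptions, and the mapping from timescale to discount rate assumes a horizon of 1/(1 - γ) steps at the ten updates per second mentioned above.

```python
import numpy as np


def timescale_to_gamma(seconds, step_seconds=0.1):
    """Discount rate whose horizon of 1/(1 - gamma) steps spans `seconds`.

    Hypothetical helper: assumes one update every `step_seconds`.
    """
    steps = seconds / step_seconds
    return 1.0 - 1.0 / steps


class LinearTDLambda:
    """Several linear TD(lambda) predictions sharing one feature vector.

    Hypothetical sketch: one weight vector and one accumulating
    eligibility trace per discount rate.
    """

    def __init__(self, n_features, gammas, alpha=0.1, lam=0.9):
        self.gammas = np.asarray(gammas, dtype=float)
        n_predictions = len(self.gammas)
        self.w = np.zeros((n_predictions, n_features))  # weights
        self.z = np.zeros((n_predictions, n_features))  # eligibility traces
        self.alpha = alpha  # step size (assumed value)
        self.lam = lam      # trace decay parameter (assumed value)

    def predict(self, x):
        """Return one prediction per discount rate for feature vector x."""
        return self.w @ x

    def update(self, x, pseudo_reward, x_next):
        """Apply one TD(lambda) step for the transition x -> x_next."""
        # TD error for every prediction at once.
        delta = pseudo_reward + self.gammas * (self.w @ x_next) - self.w @ x
        # Decay each trace by gamma * lambda, then add the current features.
        self.z = (self.gammas * self.lam)[:, None] * self.z + x[None, :]
        # Move each weight vector along its trace, scaled by its TD error.
        self.w += self.alpha * delta[:, None] * self.z
        return delta


# Usage: predict a constant signal of 1 at four timescales (in seconds).
gammas = [timescale_to_gamma(s) for s in (0.1, 0.5, 2.0, 8.0)]
learner = LinearTDLambda(n_features=1, gammas=gammas)
x = np.ones(1)  # a single always-active feature, for illustration only
for _ in range(5000):
    learner.update(x, 1.0, x)
predictions = learner.predict(x)
```

For a constant pseudo reward of 1, the discounted sum is 1/(1 - γ), so each entry of `predictions` should approach the number of steps in its timescale. A real system along the lines of the abstract would replace the single feature with a sparse tile-coded vector and use a different pseudo reward per prediction.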
Pages: 146 - 160
Page count: 15