Multi-timescale nexting in a reinforcement learning robot

Cited by: 37
Authors
Modayil, Joseph [1]
White, Adam [1]
Sutton, Richard S. [1]
Affiliations
[1] Univ Alberta, Reinforcement Learning & Artificial Intelligence, Edmonton, AB, Canada
Keywords
Reinforcement learning; robotics; predictive knowledge; temporal difference learning; FUTURE;
DOI
10.1177/1059712313511648
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The term 'nexting' has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense. The ability to 'next' constitutes a basic kind of awareness and knowledge of one's environment. In this paper we present results with a robot that learns to next in real time, making thousands of predictions about sensory input signals at timescales from 0.1 to 8 seconds. Our predictions are formulated as a generalization of the value functions commonly used in reinforcement learning, where now an arbitrary function of the sensory input signals is used as a pseudo reward, and the discount rate determines the timescale. We show that six thousand predictions, each computed as a function of six thousand features of the state, can be learned and updated online ten times per second on a laptop computer, using the standard temporal-difference(λ) algorithm with linear function approximation. This approach is sufficiently computationally efficient to be used for real-time learning on the robot and sufficiently data efficient to achieve substantial accuracy within 30 minutes. Moreover, a single tile-coded feature representation suffices to accurately predict many different signals over a significant range of timescales. We also extend nexting beyond simple timescales by letting the discount rate be a function of the state and show that nexting predictions of this more general form can also be learned with substantial accuracy. General nexting provides a simple yet powerful mechanism for a robot to acquire predictive knowledge of the dynamics of its environment.
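The prediction scheme described in the abstract can be illustrated with a minimal sketch of linear TD(λ) applied to a single pseudo reward. This is not the authors' implementation: the class name `LinearTDLambda`, the helper `timescale_to_gamma`, the accumulating traces, and the step-size and λ values are all assumptions made for illustration. Binary features are passed as lists of active indices, standing in for a tile-coded representation, and the mapping from timescale to discount rate assumes the common convention that a discount rate γ corresponds to a horizon of about 1 / (1 - γ) time steps.

```python
def timescale_to_gamma(steps):
    """Discount rate whose horizon 1 / (1 - gamma) equals `steps` time steps.

    Assumed convention: at 10 updates per second, 8 seconds is 80 steps.
    """
    return 1.0 - 1.0 / steps


class LinearTDLambda:
    """One nexting-style prediction learned by TD(lambda) with linear weights.

    Hypothetical illustration: sparse binary features, accumulating traces.
    """

    def __init__(self, n_features, gamma, lam=0.9, alpha=0.1):
        self.w = [0.0] * n_features   # weight vector
        self.e = [0.0] * n_features   # eligibility trace
        self.gamma = gamma            # discount rate (sets the timescale)
        self.lam = lam                # trace decay parameter
        self.alpha = alpha            # step size

    def predict(self, active):
        # With binary features the prediction is the sum of active weights.
        return sum(self.w[i] for i in active)

    def update(self, active, pseudo_reward, next_active):
        # TD error for the pseudo reward observed on this transition.
        delta = (pseudo_reward
                 + self.gamma * self.predict(next_active)
                 - self.predict(active))
        # Decay all traces, then bump the traces of the active features.
        decay = self.gamma * self.lam
        for i in range(len(self.e)):
            self.e[i] *= decay
        for i in active:
            self.e[i] += 1.0
        # Move every weight along its trace in proportion to the TD error.
        step = self.alpha * delta
        for i in range(len(self.w)):
            self.w[i] += step * self.e[i]
        return delta
```

In this sketch, many predictions at different timescales would simply be many such objects sharing the same active-feature list, each with its own pseudo reward and discount rate. For a signal that is constantly 1 and a discount rate of 0.5, the learned prediction should approach the discounted sum 1 / (1 - 0.5) = 2.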
Pages: 146-160
Number of pages: 15