Reinforcement learning based on intrinsic motivation and temporal abstraction via transformation invariance

Citations: 0
Authors
Masuyama, Gakuto [1 ]
Yamashita, Atsushi [1 ]
Asama, Hajime [1 ]
Affiliations
[1] Department of Precision Engineering, Faculty of Engineering, University of Tokyo, Bunkyo-ku, Tokyo 113-8656
Source
Masuyama, G. (masuyama@robot.t.u-tokyo.ac.jp) | Transactions of the Japan Society of Mechanical Engineers, Series C, Vol. 79, 2013 / Japan Society of Mechanical Engineers
Keywords
Knowledge engineering; Learning control; Reinforcement learning; Robot
DOI
10.1299/kikaic.79.289
Abstract
Bottom-up processes have received much attention in unsupervised and developmental learning research. In contrast, this paper discusses the effectiveness of top-down presumption for the acquisition of adaptive behavior. A successful past experience, i.e., a skill that can be expected to be reused successfully in a novel environment, is stored in memory. Abstract environment recognition via geometric transformation invariance is then introduced to measure the reproducibility of an executed skill in a novel environment. The reproducibility of a skill in the environment is further used to construct an intrinsic motivation that drives the agent toward active conceptualization of the search space, enabling the agent to relativize its current skill execution robustly in diverse environments. These characteristics of the top-down process are implemented on reinforcement learning and examined through simulation experiments in a grid world. The results demonstrate that active conceptualization of the environment accelerates learning progress. Experiments in scaled environments further show that subjective anticipation can yield a consistent strategy for exploration and exploitation. An eligibility trace is also introduced for the skill-utility problem, and it is shown that traces over both actions and skills can preserve learning performance across diverse skill settings. ©2013 The Japan Society of Mechanical Engineers.
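The abstract's ingredients, intrinsically motivated reward shaping plus eligibility traces in a grid world, can be sketched in a few lines. The following is a minimal illustration, not the authors' method: it uses a simple count-based novelty bonus as a stand-in for the paper's reproducibility-based intrinsic motivation, and naive Q(λ)-style accumulating traces. All names, constants, and the grid-world setup are illustrative assumptions.

```python
import random

random.seed(0)  # fixed seed for reproducibility of the sketch

SIZE = 5                                  # 5x5 grid, start (0,0), goal (4,4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
ALPHA, GAMMA, LAMBDA, EPS, BETA = 0.5, 0.95, 0.9, 0.1, 0.2

def step(state, action):
    """Deterministic grid-world transition; reward 1 only at the goal."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (x, y)
    done = nxt == (SIZE - 1, SIZE - 1)
    return nxt, (1.0 if done else 0.0), done

Q = {((x, y), a): 0.0 for x in range(SIZE) for y in range(SIZE)
     for a in range(len(ACTIONS))}
visits = {}  # state visit counts for the novelty bonus

def greedy(s):
    return max(range(len(ACTIONS)), key=lambda a: Q[(s, a)])

for episode in range(200):
    s, trace = (0, 0), {}
    for t in range(100):
        a = random.randrange(len(ACTIONS)) if random.random() < EPS else greedy(s)
        s2, r_ext, done = step(s, ACTIONS[a])
        visits[s2] = visits.get(s2, 0) + 1
        # extrinsic reward plus a decaying intrinsic (novelty) bonus
        r = r_ext + BETA / visits[s2] ** 0.5
        target = r + (0.0 if done else GAMMA * Q[(s2, greedy(s2))])
        delta = target - Q[(s, a)]
        trace[(s, a)] = trace.get((s, a), 0.0) + 1.0  # accumulating trace
        for key, e in trace.items():
            Q[key] += ALPHA * delta * e               # credit recent pairs
            trace[key] = e * GAMMA * LAMBDA           # decay the trace
        s = s2
        if done:
            break

print(greedy((0, 0)))  # index of the greedy action at the start state
```

The intrinsic bonus shrinks as states are revisited, so exploration pressure fades once the environment is conceptualized, loosely mirroring the exploration/exploitation balance the abstract attributes to subjective anticipation.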
Pages: 289-303
References
18 references
  • [1] Sutton R.S., Barto A.G., Reinforcement Learning: An Introduction, (1998)
  • [2] Weng J., McClelland J., Pentland A., Sporns O., Stockman I., Sur M., Thelen E., Autonomous mental development by robots and animals, Science, 291, 5504, pp. 599-600, (2001)
  • [3] Singh S., Barto A.G., Chentanez N., Intrinsically motivated reinforcement learning, Proceedings of the Advances in Neural Information Processing Systems, 17, pp. 1281-1288, (2005)
  • [4] Stout A., Konidaris G.D., Barto A.G., Intrinsically motivated reinforcement learning: A promising framework for developmental robot learning, AAAI Spring Symposium on Developmental Robotics, pp. 1281-1288, (2005)
  • [5] Sutton R.S., Precup D., Singh S., Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning, Artificial Intelligence, 112, pp. 181-211, (1999)
  • [6] Oudeyer P.-Y., Kaplan F., Hafner V.V., Intrinsic motivation systems for autonomous mental development, IEEE Transactions on Evolutionary Computation, 11, 2, pp. 265-286, (2007)
  • [7] Vigorito C.M., Barto A.G., Intrinsically motivated hierarchical skill learning in structured environments, IEEE Transactions on Autonomous Mental Development, 2, 2, pp. 83-90, (2010)
  • [8] Konidaris G.D., Barto A.G., Building portable options: Skill transfer in reinforcement learning, Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2, pp. 895-900, (2007)
  • [9] Summerfield C., Egner T., Expectation (and Attention) in visual cognition, Trends in Cognitive Sciences, 13, 9, pp. 403-409, (2009)
  • [10] Kunda Z., The case for motivated reasoning, Psychological Bulletin, 103, 3, pp. 480-498, (1990)