Autonomous Learning of State Representations for Control: An Emerging Field Aims to Autonomously Learn State Representations for Reinforcement Learning Agents from Their Real-World Sensor Observations

Cited by: 32
Authors
Böhmer W. [1 ]
Springenberg J.T. [2 ]
Boedecker J. [2 ]
Riedmiller M. [2 ]
Obermayer K. [1 ]
Affiliations
[1] Neural Information Processing Group, Technische Universität Berlin, Sekr. MAR 5-6, Marchstrasse 23, Berlin
[2] Machine Learning Lab, Universität Freiburg, Freiburg
Source
KI - Künstliche Intelligenz | 2015 / Vol. 29 / Iss. 4
Keywords
Autonomous robotics; Deep auto-encoder networks; End-to-end reinforcement learning; Representation learning; Slow feature analysis
DOI
10.1007/s13218-015-0356-1
Abstract
This article reviews an emerging field that aims for autonomous reinforcement learning (RL) directly from sensor observations. Straightforward end-to-end RL has recently shown remarkable success, but relies on large numbers of samples. As this is not feasible in robotics, we review two approaches that learn intermediate state representations from previous experience: deep auto-encoders and slow feature analysis. We analyze theoretical properties of the representations and point to potential improvements. © 2015, Springer-Verlag Berlin Heidelberg.
Pages: 353-362
Page count: 9
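
The two representation-learning approaches named in the abstract are concrete algorithms. As a toy illustration of the second one, below is a minimal sketch of linear slow feature analysis in Python/NumPy. The linear_sfa helper and its interface are our own illustrative assumptions, not code from the paper, which discusses regularized kernel SFA and richer variants.

import numpy as np

def linear_sfa(X, n_features):
    """Linear slow feature analysis (SFA): find linear projections of an
    observation sequence X (shape T x d) whose outputs change as slowly
    as possible over time, under unit-variance/decorrelation constraints."""
    mean = X.mean(axis=0)
    X = X - mean
    # Whiten the data so the SFA constraints reduce to a plain eigenproblem.
    cov = X.T @ X / len(X)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-8                      # drop near-zero-variance directions
    W_white = eigvec[:, keep] / np.sqrt(eigval[keep])
    Z = X @ W_white                           # whitened signals, unit covariance
    # Slowness objective: minimize the variance of temporal differences.
    dZ = np.diff(Z, axis=0)
    cov_dot = dZ.T @ dZ / len(dZ)
    _, svec = np.linalg.eigh(cov_dot)         # ascending: slowest directions first
    W = W_white @ svec[:, :n_features]
    return mean, W                            # slow features: (X - mean) @ W

# Toy usage: recover a slow sinusoid hidden in a 10-d "sensor" mixture.
rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 2000)
latent = np.stack([np.sin(t / 8), np.sin(t), np.sin(3 * t)], axis=1)
obs = latent @ rng.standard_normal((3, 10))
mean, W = linear_sfa(obs, n_features=1)
slow = (obs - mean) @ W                       # approximates sin(t/8) up to sign/scale

In an RL context, the recovered slow features would serve as the agent's state representation, on top of which a standard value-based method can be run; a deep auto-encoder would instead be trained to reconstruct the observations and use its bottleneck activations as the state.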