20 entries in total
- [1] Wiewiora E., Cottrell G.W., Elkan C., Principled methods for advising reinforcement learning agents, Proc. of the 20th Int'l Conf. on Machine Learning, pp. 792-799, (2003)
- [2] Babes M., Munoz de Cote E., Littman M.L., Social reward shaping in the prisoner's dilemma, Proc. of the 7th Int'l Joint Conf. on Autonomous Agents and Multi-Agent Systems, 3, pp. 1389-1392, (2008)
- [3] Marthi B., Automatic shaping and decomposition of reward functions, Proc. of the 24th Int'l Conf. on Machine Learning, pp. 601-608, (2007)
- [4] Randløv J., Alstrøm P., Learning to drive a bicycle using reinforcement learning and shaping, Proc. of the 15th Int'l Conf. on Machine Learning, pp. 463-471, (1998)
- [5] Dorigo M., Colombetti M., Robot shaping: Developing autonomous agents through learning, Artificial Intelligence, 71, 2, pp. 321-370, (1994)
- [6] Mataric M.J., Reward functions for accelerated learning, Proc. of the 11th Int'l Conf. on Machine Learning, pp. 181-189, (1994)
- [7] Ng A.Y., Harada D., Russell S.J., Policy invariance under reward transformations: Theory and application to reward shaping, Proc. of the 16th Int'l Conf. on Machine Learning, pp. 278-287, (1999)
- [8] Devlin S., Kudenko D., Dynamic potential-based reward shaping, Proc. of the 11th Int'l Joint Conf. on Autonomous Agents and Multiagent Systems, pp. 433-440, (2012)
- [9] Ng A.Y., Russell S.J., Algorithms for inverse reinforcement learning, Proc. of the 17th Int'l Conf. on Machine Learning, pp. 663-670, (2000)
- [10] Ziebart B.D., Maas A.L., Bagnell J.A., Dey A.K., Maximum entropy inverse reinforcement learning, Proc. of the 23rd AAAI Conf. on Artificial Intelligence, pp. 1433-1438, (2008)