INTERNALLY DRIVEN Q-LEARNING: Convergence and Generalization Results

Cited by: 0
Authors
Alonso, Eduardo [1 ]
Mondragon, Esther
Kjaell-Ohlsson, Niclas [1 ]
Affiliations
[1] City Univ London, Dept Comp, London EC1V 0HB, England
Source
ICAART: PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1 | 2012
Keywords
Q-learning; IDQ-learning; Internal Drives; Convergence; Generalization;
DOI
10.5220/0003736404910494
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an approach to the reinforcement learning problem in which agents are provided with internal drives against which they evaluate the value of states according to a similarity function. We extend Q-learning by substituting internally driven values for ad hoc rewards. The resulting algorithm, Internally Driven Q-learning (IDQ-learning), is experimentally shown to converge to optimality and to generalize well. These results are preliminary yet encouraging: IDQ-learning is more psychologically plausible than Q-learning, and it devolves control, and thus autonomy, to agents that would otherwise be at the mercy of the environment (i.e., of the designer).
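The core idea in the abstract — standard tabular Q-learning with the external reward replaced by an internally generated value — can be sketched as follows. The similarity function (negative Euclidean distance to a drive vector), the toy chain environment, and all parameter values are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def similarity(state_features, drive):
    """Internal evaluation of a state: negative distance to the drive vector.
    This particular similarity function is an assumption for illustration."""
    return -float(np.linalg.norm(state_features - drive))

def idq_learning(n_states, n_actions, features, drive, transitions,
                 episodes=500, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1,
                 seed=0):
    """Tabular Q-learning where the environment reward is replaced by an
    internally driven value, similarity(next state, drive)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for _ in range(steps):
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s]))
            s_next = transitions[s][a]
            # internal drive value replaces the external reward r
            r = similarity(features[s_next], drive)
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
    return Q

# Toy 4-state chain; the internal drive "prefers" state 3 (feature 3.0).
features = np.array([[0.0], [1.0], [2.0], [3.0]])
transitions = [[0, 1], [0, 2], [1, 3], [2, 3]]  # transitions[s][a]: a=0 left, a=1 right
Q = idq_learning(4, 2, features, np.array([3.0]), transitions)
greedy = [int(np.argmax(Q[s])) for s in range(3)]  # learned greedy actions in states 0-2
```

On this toy chain the greedy policy learns to move right toward the state most similar to the drive, which is the behavior an ordinary reward of +1 at the goal would also induce — the point of the paper is that the evaluation is generated internally rather than supplied by the designer.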
Pages: 491-494
Page count: 4
Related papers
2 items
[1]
Sutton R. S., 2017, Introduction to Reinforcement Learning
[2]
Watkins C. J. C. H., 1992, Machine Learning, Vol. 8, p. 279, DOI 10.1007/BF00992698