INTERNALLY DRIVEN Q-LEARNING: Convergence and Generalization Results

Cited by: 0
Authors
Alonso, Eduardo [1 ]
Mondragon, Esther
Kjaell-Ohlsson, Niclas [1 ]
Affiliations
[1] City Univ London, Dept Comp, London EC1V 0HB, England
Source
ICAART: PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1 | 2012
Keywords
Q-learning; IDQ-learning; Internal Drives; Convergence; Generalization;
DOI
10.5220/0003736404910494
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an approach to the reinforcement learning problem in which agents are provided with internal drives against which they evaluate the value of states according to a similarity function. We extend Q-learning by substituting internally driven values for ad hoc rewards. The resulting algorithm, Internally Driven Q-learning (IDQ-learning), is experimentally shown to converge to optimality and to generalize well. These results are preliminary yet encouraging: IDQ-learning is more psychologically plausible than Q-learning, and it devolves control, and thus autonomy, to agents that would otherwise be at the mercy of the environment (i.e., of the designer).
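The core idea in the abstract — standard tabular Q-learning with the external reward replaced by an internally generated value — can be sketched as follows. The similarity function (negative Euclidean distance to a drive vector), the toy chain environment, and all parameter values are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def similarity(state_features, drive):
    """Internal evaluation of a state: negative distance to the drive vector.
    This particular similarity function is an assumption for illustration."""
    return -float(np.linalg.norm(state_features - drive))

def idq_learning(n_states, n_actions, features, drive, transitions,
                 episodes=500, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1,
                 seed=0):
    """Tabular Q-learning where the environment reward is replaced by an
    internally driven value, similarity(next state, drive)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for _ in range(steps):
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[s]))
            s_next = transitions[s][a]
            # internal drive value replaces the external reward r
            r = similarity(features[s_next], drive)
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
            s = s_next
    return Q

# Toy 4-state chain; the internal drive "prefers" state 3 (feature 3.0).
features = np.array([[0.0], [1.0], [2.0], [3.0]])
transitions = [[0, 1], [0, 2], [1, 3], [2, 3]]  # transitions[s][a]: a=0 left, a=1 right
Q = idq_learning(4, 2, features, np.array([3.0]), transitions)
greedy = [int(np.argmax(Q[s])) for s in range(3)]  # learned greedy actions in states 0-2
```

On this toy chain the greedy policy learns to move right toward the state most similar to the drive, which is the behavior an ordinary reward of +1 at the goal would also induce — the point of the paper is that the evaluation is generated internally rather than supplied by the designer.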
Pages: 491-494
Page count: 4
Related papers
2 items
[1]
Sutton R. S., 2017, Introduction to Reinforcement Learning
[2]
Watkins C. J. C. H., 1992, Machine Learning, Vol. 8, p. 279, DOI 10.1007/BF00992698