Adding prediction risk to the theory of reward learning

Cited by: 100
Authors
Preuschoff, Kerstin [1]
Bossaerts, Peter [1]
Affiliations
[1] CALTECH, Pasadena, CA 91125 USA
Source
REWARD AND DECISION MAKING IN CORTICOBASAL GANGLIA NETWORKS | 2007 / Vol. 1104
Keywords
reinforcement learning; learning rate; least squares learning; dopaminergic system; reward anticipation; prediction risk; uncertainty; adaptive encoding
DOI
10.1196/annals.1390.005
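Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract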
This article analyzes the simple Rescorla-Wagner learning rule from the vantage point of least squares learning theory. In particular, it suggests how measures of risk, such as prediction risk, can be used to adjust the learning constant in reinforcement learning. It argues that prediction risk is most effectively incorporated by scaling the prediction errors. This way, the learning rate needs adjusting only when the covariance between optimal predictions and past (scaled) prediction errors changes. Evidence is discussed that suggests that the dopaminergic system in the (human and nonhuman) primate brain encodes prediction risk, and that prediction errors are indeed scaled with prediction risk (adaptive encoding).
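The idea of scaling prediction errors by prediction risk can be illustrated with a minimal Python sketch. It is not the article's own formulation: it assumes that prediction risk is tracked as a running estimate of the squared prediction error and that each error is divided by the square root of that estimate before the value update. The function names and the parameters `alpha`, `beta`, `risk0`, and `eps` are hypothetical and chosen only for illustration.

```python
def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Plain Rescorla-Wagner rule: V <- V + alpha * (r - V)."""
    v = v0
    values = []
    for r in rewards:
        delta = r - v          # prediction error
        v += alpha * delta     # fixed learning constant
        values.append(v)
    return values


def risk_scaled_rw(rewards, alpha=0.1, beta=0.1, v0=0.0, risk0=1.0, eps=1e-8):
    """Rescorla-Wagner with risk-scaled prediction errors (illustrative only).

    Assumption: prediction risk is a running estimate of the squared
    prediction error, itself learned with a Rescorla-Wagner-style rule.
    """
    v, risk = v0, risk0
    values, risks = [], []
    for r in rewards:
        delta = r - v                         # raw prediction error
        scaled = delta / (risk ** 0.5 + eps)  # error in units of predicted risk
        v += alpha * scaled                   # update value with the scaled error
        risk += beta * (delta ** 2 - risk)    # update the risk estimate
        values.append(v)
        risks.append(risk)
    return values, risks


if __name__ == "__main__":
    import random

    random.seed(0)
    # Rewards with mean 1.0 and a standard deviation of 5.0 (arbitrary choice).
    rewards = [random.gauss(1.0, 5.0) for _ in range(500)]
    plain = rescorla_wagner(rewards)
    scaled, risks = risk_scaled_rw(rewards)
    print("plain estimate:      ", plain[-1])
    print("risk-scaled estimate:", scaled[-1])
    print("risk estimate:       ", risks[-1])
```

In this sketch the scaled error is expressed in units of its expected magnitude, so the step size taken by the value update does not grow with the spread of the rewards. That is one way to read the abstract's claim that, with scaled errors, the learning rate needs less frequent adjustment.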
Pages: 135-146
Page count: 12
Related papers
21 in total
[1]  
BEHRENS T, 2006, LEARNING VALUE INFOR
[2]  
BOSSAERTS P, IN PRESS NEW ENCY NE
[3]   Discrete coding of reward probability and uncertainty by dopamine neurons [J].
Fiorillo, CD ;
Tobler, PN ;
Schultz, W .
SCIENCE, 2003, 299 (5614) :1898-1902
[4]   Evidence that the delay-period activity of dopamine neurons corresponds to reward uncertainty rather than backpropagating TD errors [J].
Fiorillo, Christopher D. ;
Tobler, Philippe N. ;
Schultz, Wolfram .
BEHAVIORAL AND BRAIN FUNCTIONS, 2005, 1 (1)
[5]   Anatomy of the insula - functional and clinical correlates [J].
Flynn, FG ;
Benson, DF ;
Ardila, A .
APHASIOLOGY, 1999, 13 (01) :55-78
[6]   Risk aversion and incentive effects [J].
Holt, CA ;
Laury, SK .
AMERICAN ECONOMIC REVIEW, 2002, 92 (05) :1644-1655
[7]   Decisions under uncertainty: Probabilistic context influences activation of prefrontal and parietal cortices [J].
Huettel, SA ;
Song, AW ;
McCarthy, G .
JOURNAL OF NEUROSCIENCE, 2005, 25 (13) :3304-3311
[8]   MEAN-VARIANCE VERSUS DIRECT UTILITY MAXIMIZATION [J].
KROLL, Y ;
LEVY, H ;
MARKOWITZ, HM .
JOURNAL OF FINANCE, 1984, 39 (01) :47-61
[9]   The neural basis of financial risk taking [J].
Kuhnen, CM ;
Knutson, B .
NEURON, 2005, 47 (05) :763-770
[10]   FOUNDATIONS OF PORTFOLIO THEORY [J].
MARKOWITZ, HM .
JOURNAL OF FINANCE, 1991, 46 (02) :469-477