Differential effects of reward and punishment in decision making under uncertainty: a computational study

Cited by: 2

Authors:

Duffin, Elaine [1]

Bland, Amy R. [2]

Schaefer, Alexandre [3]

de Kamps, Marc [1]

Affiliations:

[1] Univ Leeds, Sch Comp, Leeds LS2 9JT, W Yorkshire, England

[2] Univ Manchester, Neurosci & Psychiat Unit, Manchester, Lancs, England

[3] Monash Univ, Sch Business, Bandar Sunway, Malaysia

Source:

FRONTIERS IN NEUROSCIENCE | 2014 / Volume 8

Funding:

Biotechnology and Biological Sciences Research Council (UK); Engineering and Physical Sciences Research Council (UK);

Keywords:

decision making; uncertainty; volatility; reward and punishment; Bayesian learning; reinforcement learning; LOSS-AVERSION; DOPAMINE; ATTENTION; CHOICES; LOSSES; ROLES; GO;

DOI:

10.3389/fnins.2014.00030

Chinese Library Classification (CLC) number:

Q189 [Neuroscience];

Discipline classification code:

071006;

Abstract:

Computational models of learning have proved largely successful in characterizing potential mechanisms that allow humans to make decisions in uncertain and volatile contexts. We report here findings that extend existing knowledge and show that a modified reinforcement learning model, which has separate parameters according to whether the previous trial gave a reward or a punishment, can provide the best fit to human behavior in decision making under uncertainty. More specifically, we examined the fit of our modified reinforcement learning model to human behavioral data in a probabilistic two-alternative decision-making task with rule reversals. Our results demonstrate that this model predicted human behavior better than a series of other models based on reinforcement learning or Bayesian reasoning. Unlike the Bayesian models, our modified reinforcement learning model does not include any representation of rule switches. When our task is considered purely as a machine learning task, where the goal is to gain as many rewards as possible without trying to describe human behavior, the performance of the modified reinforcement learning and Bayesian methods is similar. Others have used various computational models to describe human behavior in similar tasks; however, we are not aware of any who have compared Bayesian reasoning with reinforcement learning modified to differentiate rewards and punishments.
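The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea: a value-learning agent whose parameters switch depending on whether the most recent outcome was a reward or a punishment, run on a two-alternative probabilistic task with rule reversals. The function names, the use of a softmax choice rule, and every parameter value (`alpha_reward`, `alpha_punish`, `beta_reward`, `beta_punish`, `p_correct`, `reversal_every`) are illustrative assumptions, not the authors' specification.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A under a two-option softmax rule."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def simulate(n_trials=200, reversal_every=40, p_correct=0.8,
             alpha_reward=0.3, alpha_punish=0.6,
             beta_reward=5.0, beta_punish=2.0, seed=0):
    """Simulate a hypothetical agent with outcome-dependent parameters.

    All parameter values are assumptions chosen for illustration only.
    Returns the summed outcomes (+1 reward, -1 punishment) and final values.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]          # learned value of each of the two options
    good = 0                # index of the currently better option
    prev_rewarded = True    # assumed starting state before the first trial
    total = 0.0

    for t in range(n_trials):
        # Rule reversal: the better option switches at fixed intervals.
        if t > 0 and t % reversal_every == 0:
            good = 1 - good

        # Choice sensitivity depends on the previous trial's outcome.
        beta = beta_reward if prev_rewarded else beta_punish
        p_first = softmax_choice_prob(q[0], q[1], beta)
        choice = 0 if rng.random() < p_first else 1

        # Probabilistic feedback: the better option usually pays off.
        p_win = p_correct if choice == good else 1.0 - p_correct
        outcome = 1.0 if rng.random() < p_win else -1.0

        # Learning rate depends on whether this trial gave reward or punishment.
        alpha = alpha_reward if outcome > 0 else alpha_punish
        q[choice] += alpha * (outcome - q[choice])

        prev_rewarded = outcome > 0
        total += outcome

    return total, q
```

Calling `simulate()` returns the summed outcome and the final option values. Note that the agent holds no explicit representation of rule switches, in keeping with the abstract's description; reversals are tracked only through the value updates.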


Pages: 13
