Testing models of context-dependent outcome encoding in reinforcement learning

Times cited: 9
Authors
Hayes, William M. [1 ,2 ]
Wedell, Douglas H. [1 ]
Affiliations
[1] Univ South Carolina, Dept Psychol, Columbia, SC 29208 USA
[2] Univ South Carolina, Dept Psychol, 1512 Pendleton St, Columbia, SC 29208 USA
Keywords
Relative encoding; Decisions from experience; Range-frequency theory; Reference point dependence; Decision by sampling; DECISION; ADAPTATION; REPRESENTATIONS; NORMALIZATION; PERCEPTIONS; EXPERIENCE; JUDGMENT; RECALL; MEMORY; PRICE
DOI
10.1016/j.cognition.2022.105280
Chinese Library Classification
B84 [Psychology]
Discipline code
04; 0402
Abstract
Previous studies of reinforcement learning (RL) have established that choice outcomes are encoded in a context-dependent fashion. Several computational models have been proposed to explain context-dependent encoding, including reference point centering and range adaptation models. The former assumes that outcomes are centered around a running estimate of the average reward in each choice context, while the latter assumes that outcomes are compared to the minimum reward and then scaled by an estimate of the range of outcomes in each choice context. However, there are other computational mechanisms that can explain context dependence in RL. In the present study, a frequency encoding model is introduced that assumes outcomes are evaluated based on their proportional rank within a sample of recently experienced outcomes from the local context. A range-frequency model is also considered that combines the range adaptation and frequency encoding mechanisms. We conducted two fully incentivized behavioral experiments using choice tasks for which the candidate models make divergent predictions. The results were most consistent with models that incorporate frequency or rank-based encoding. The findings from these experiments deepen our understanding of the underlying computational processes mediating context-dependent outcome encoding in human RL.
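The four encoding schemes named in the abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' exact model specifications: the function names, the (n − 1) rank normalization, the zero-range guard, and the weighting parameter `w` are assumptions made for this example.

```python
# Illustrative sketch of four context-dependent outcome-encoding schemes.
# All names and parameter choices are assumptions for illustration, not
# the exact models fitted in the paper.

def center(outcome, context_avg):
    """Reference-point centering: outcome relative to a running estimate
    of the average reward in the current choice context."""
    return outcome - context_avg

def range_adapt(outcome, context_min, context_max):
    """Range adaptation: distance from the context minimum, scaled by an
    estimate of the outcome range (guarded against a zero range)."""
    return (outcome - context_min) / max(context_max - context_min, 1e-9)

def freq_encode(outcome, recent_outcomes):
    """Frequency encoding: proportional rank of the outcome within a
    sample of recently experienced outcomes from the local context."""
    sample = list(recent_outcomes)
    if len(sample) <= 1:
        return 0.5  # no basis for ranking; neutral value (assumption)
    below = sum(1 for x in sample if x < outcome)
    return below / (len(sample) - 1)

def range_freq(outcome, context_min, context_max, recent_outcomes, w=0.5):
    """Range-frequency encoding: a weighted compromise between the range
    and frequency values, in the spirit of Parducci's theory."""
    return (w * range_adapt(outcome, context_min, context_max)
            + (1 - w) * freq_encode(outcome, recent_outcomes))

# Example: an outcome of 5 in a context spanning [0, 10], with recently
# experienced outcomes [1, 2, 3, 4, 5]
print(range_adapt(5.0, 0.0, 10.0))                   # 0.5
print(freq_encode(5.0, [1, 2, 3, 4, 5]))             # 1.0 (top rank)
print(range_freq(5.0, 0.0, 10.0, [1, 2, 3, 4, 5]))   # 0.75
```

The divergent predictions exploited in the experiments follow directly from these transformations: range adaptation depends only on the context endpoints, whereas frequency encoding depends on how outcomes are distributed between them.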
Pages: 24
References (73 total; first 10 listed)
[1] Barron G, Erev I. Small feedback-based decisions and their limited correspondence to description-based decisions. Journal of Behavioral Decision Making, 2003, 16(3): 215-233.
[2] Bavard S, 2021, SOC NEUROECONOMICS 1.
[3] Bavard S, Rustichini A, Palminteri S. Two sides of the same coin: Beneficial and detrimental consequences of range adaptation in human reinforcement learning. Science Advances, 2021, 7(14).
[4] Bavard S, Lebreton M, Khamassi M, Coricelli G, Palminteri S. Reference-point centering and range-adaptation enhance human reinforcement learning at the cost of irrational preferences. Nature Communications, 2018, 9.
[5] Bhui R, Gershman SJ. Decision by sampling implements efficient coding of psychoeconomic functions. Psychological Review, 2018, 125(6): 985-1001.
[6] Birnbaum MH. Using contextual effects to derive psychophysical scales. Perception & Psychophysics, 1974, 15(1): 89-96.
[7] Brown GDA, Matthews WJ. Decision by sampling and memory distinctiveness: range effects from rank-based models of judgment and choice. Frontiers in Psychology, 2011, 2.
[8] Burke CJ, Baddeley M, Tobler PN, Schultz W. Partial adaptation of obtained and observed value signals preserves information about gains and losses. Journal of Neuroscience, 2016, 36(39): 10016-10025.
[9] Carandini M, Heeger DJ. Normalization as a canonical neural computation. Nature Reviews Neuroscience, 2012, 13(1): 51-62.
[10] Choplin JM. Judgment and Decision Making, 2014, 9: 243.