Computational modeling of epiphany learning

Cited by: 18
Authors
Chen, Wei James [1]
Krajbich, Ian [1, 2]
Affiliations
[1] Ohio State Univ, Dept Econ, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Psychol, Columbus, OH 43210 USA
Funding
National Science Foundation (USA)
Keywords
epiphany learning; eye tracking; pupil dilation; beauty contest; decision making; EYE-TRACKING; DECISION-MAKING; INFORMATION SEARCH; VISUAL FIXATIONS; PUPIL-DILATION; REACTION-TIME; CHOICE; GAMES; ATTENTION; DYNAMICS;
DOI
10.1073/pnas.1618161114
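Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract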
Models of reinforcement learning (RL) are prevalent in the decision-making literature, but not all behavior seems to conform to the gradual convergence that is a central feature of RL. In some cases learning seems to happen all at once. Limited prior research on these "epiphanies" has shown evidence of sudden changes in behavior, but it remains unclear how such epiphanies occur. We propose a sequential-sampling model of epiphany learning (EL) and test it using an eye-tracking experiment. In the experiment, subjects repeatedly play a strategic game that has an optimal strategy. Subjects can learn over time from feedback but are also allowed to commit to a strategy at any time, eliminating all other options and opportunities to learn. We find that the EL model is consistent with the choices, eye movements, and pupillary responses of subjects who commit to the optimal strategy (correct epiphany) but not always of those who commit to a suboptimal strategy or who do not commit at all. Our findings suggest that EL is driven by a latent evidence accumulation process that can be revealed with eye-tracking data.
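As a rough illustration of the sequential-sampling idea in the abstract (not the authors' model or code), commitment can be pictured as the first trial on which a noisy, latent evidence total crosses a threshold. The function name, its parameters, and the Gaussian noise term are all assumptions made for this sketch.

```python
import random
from typing import Optional


def simulate_commitment(drift: float, noise_sd: float, threshold: float,
                        max_trials: int, seed: int = 0) -> Optional[int]:
    """Return the first trial on which accumulated evidence reaches the
    threshold, or None if it never does within max_trials.

    Hypothetical illustration only: each trial adds a constant drift plus
    Gaussian noise to a latent evidence total.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    evidence = 0.0
    for trial in range(1, max_trials + 1):
        # One noisy evidence sample per trial of feedback.
        evidence += drift + rng.gauss(0.0, noise_sd)
        if evidence >= threshold:
            return trial  # sudden, all-at-once commitment
    return None  # no commitment within the session


if __name__ == "__main__":
    # With noise, the commitment trial varies from one seed to the next.
    for seed in range(3):
        print(seed, simulate_commitment(0.5, 1.0, 5.0, 40, seed=seed))
```

In this picture the visible behavior changes abruptly at the crossing point even though the latent total builds gradually, in contrast to the gradual convergence in choices that the abstract attributes to RL.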
Pages: 4637-4642
Page count: 6