Learning optimal decisions with confidence

Cited by: 40

Authors:

Drugowitsch, Jan [1]

Mendonca, Andre G. [2]

Mainen, Zachary F. [2]

Pouget, Alexandre [3]

Affiliations:

[1] Harvard Med Sch, Dept Neurobiol, Boston, MA 02115 USA

[2] Champalimaud Ctr Unknown, Champalimaud Res, P-1400038 Lisbon, Portugal

[3] Univ Geneva, Dept Basic Neurosci, CH-1211 Geneva, Switzerland

Source:

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA | 2019 / Vol. 116 / Issue 49

Funding:

European Research Council; Swiss National Science Foundation

Keywords:

decision making; diffusion models; optimality; confidence; MODELS; UNCERTAINTY; INFERENCE; MOTION; MEMORY; BRAIN;

DOI:

10.1073/pnas.1906787116

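Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]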

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Diffusion decision models (DDMs) are immensely successful models for decision making under uncertainty and time pressure. In the context of perceptual decision making, these models typically start with two input units, organized in a neuron-antineuron pair. In contrast, in the brain, sensory inputs are encoded through the activity of large neuronal populations. Moreover, while DDMs are wired by hand, the nervous system must learn the weights of the network through trial and error. There is currently no normative theory of learning in DDMs and therefore no theory of how decision makers could learn to make optimal decisions in this context. Here, we derive such a rule for learning a near-optimal linear combination of DDM inputs based on trial-by-trial feedback. The rule is Bayesian in the sense that it learns not only the mean of the weights but also the uncertainty around this mean in the form of a covariance matrix. In this rule, the rate of learning is proportional (respectively, inversely proportional) to confidence for incorrect (respectively, correct) decisions. Furthermore, we show that, in volatile environments, the rule predicts a bias toward repeating the same choice after correct decisions, with a bias strength that is modulated by the previous choice's difficulty. Finally, we extend our learning rule to cases for which one of the choices is more likely a priori, which provides insights into how such biases modulate the mechanisms leading to optimal decisions in diffusion models.
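To make the confidence-dependent learning rate concrete, the following is a minimal sketch, not the rule derived in the paper. It assumes a standard assumed-density-filtering (ADF) update for probit regression, which keeps a Gaussian belief (mean `mu`, covariance `Sigma`) over the input weights and updates it from the fed-back correct label. The function name `adf_probit_update`, the use of the predictive probability `p_correct` as a stand-in for confidence, and the omission of the diffusion dynamics themselves are all assumptions made for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def adf_probit_update(mu, Sigma, x, y):
    """One ADF step for probit regression (illustrative, hypothetical helper).

    mu    : list of floats, current mean of the weights
    Sigma : list of lists, current covariance of the weights
    x     : list of floats, input on this trial
    y     : +1 or -1, the correct label revealed by feedback
    Returns (new_mu, new_Sigma, gain), where gain scales the mean update.
    """
    n = len(mu)
    # Sigma @ x and the predictive variance x' Sigma x
    Sx = [sum(Sigma[i][j] * x[j] for j in range(n)) for i in range(n)]
    s2 = sum(x[i] * Sx[i] for i in range(n))
    denom = math.sqrt(1.0 + s2)

    # z > 0 means the current belief favoured the correct label
    z = y * sum(mu[i] * x[i] for i in range(n)) / denom
    p_correct = norm_cdf(z)  # assumed proxy for confidence in the correct label
    alpha = norm_pdf(z) / max(p_correct, 1e-12)

    # Mean update: small when the correct label was confidently predicted,
    # large when the belief confidently favoured the wrong label.
    gain = alpha / denom
    new_mu = [mu[i] + y * gain * Sx[i] for i in range(n)]

    # Covariance update: uncertainty shrinks along the direction Sigma @ x.
    shrink = alpha * (alpha + z) / (1.0 + s2)
    new_Sigma = [
        [Sigma[i][j] - shrink * Sx[i] * Sx[j] for j in range(n)]
        for i in range(n)
    ]
    return new_mu, new_Sigma, gain


if __name__ == "__main__":
    mu = [1.0, 0.0]
    Sigma = [[1.0, 0.0], [0.0, 1.0]]
    x = [1.0, 0.0]
    for label in (+1, -1):
        _, _, gain = adf_probit_update(mu, Sigma, x, label)
        print("correct label", label, "update gain", round(gain, 3))
```

In this sketch the update gain falls as the belief's probability of the correct label rises, so confidently correct trials change the weights little and confidently wrong trials change them a lot, which is the qualitative pattern the abstract describes.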


Pages: 24872-24880

Page count: 9

