Bandit Models of Human Behavior: Reward Processing in Mental Disorders

Cited by: 10

Authors:

Bouneffouf, Djallel [1]

Rish, Irina [1]

Cecchi, Guillermo A. [1]

Affiliations:

[1] IBM Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

ARTIFICIAL GENERAL INTELLIGENCE: 10TH INTERNATIONAL CONFERENCE, AGI 2017 | 2017 / Vol. 10414

Keywords:

DOI:

10.1007/978-3-319-63703-7_22

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
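To make the idea in the abstract concrete, here is a minimal sketch of Bernoulli Thompson Sampling with a simple reward-processing bias. It is an illustration under stated assumptions, not the authors' parameterization, which this record does not specify: the function name `biased_thompson_sampling` and the weights `w_pos` and `w_neg` (how strongly a positive or negative outcome updates the chosen arm's Beta posterior) are hypothetical.

```python
import random


def biased_thompson_sampling(true_probs, horizon, w_pos=1.0, w_neg=1.0, seed=0):
    """Bernoulli Thompson Sampling with asymmetric update weights.

    Hypothetical parameters (not taken from the paper):
      w_pos: weight applied to the posterior update after a reward of 1
      w_neg: weight applied to the posterior update after a reward of 0
    """
    if w_pos < 0 or w_neg < 0:
        raise ValueError("update weights must be non-negative")

    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's success probability.
    successes = [1.0] * n_arms
    failures = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(horizon):
        # Sample a plausible success rate per arm and play the best sample.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulated environment: Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Biased update: positive and negative outcomes are weighted differently.
        if reward:
            successes[arm] += w_pos
        else:
            failures[arm] += w_neg

        pulls[arm] += 1
        total_reward += reward

    return total_reward, pulls


if __name__ == "__main__":
    # Baseline versus an agent that under-weights negative outcomes.
    print(biased_thompson_sampling([0.2, 0.8], horizon=500))
    print(biased_thompson_sampling([0.2, 0.8], horizon=500, w_pos=1.0, w_neg=0.1))
```

Setting `w_pos = w_neg = 1` recovers standard Thompson Sampling with a Beta-Bernoulli model. Other settings are only one assumed way to express an over- or under-sensitivity to positive or negative outcomes.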

Citation

Pages: 237-248

Number of pages: 12

References (19 in total, first 10 listed):

[1] [Anonymous], 2012, COLT 2012 25 ANN C L

[2] Auer P, 2003, SIAM J COMPUT, V32, P48, DOI 10.1137/S0097539701398375

[3] On-line learning with malicious noise and the closure algorithm [J]. Auer, P; Cesa-Bianchi, N. ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 1998, 23 (1-2): 83-99

[4] Finite-time analysis of the multiarmed bandit problem [J]. Auer, P; Cesa-Bianchi, N; Fischer, P. MACHINE LEARNING, 2002, 47 (2-3): 235-256

[5] Multi-armed bandit problem with known trend [J]. Bouneffouf, Djallel; Feraud, Raphael. NEUROCOMPUTING, 2016, 205: 16-21

[6] Bouneffouf D, 2014, LECT NOTES COMPUT SC, V8836, P373, DOI 10.1007/978-3-319-12643-2_46

[7] Chapelle O., 2011, ADV NEURAL INFORM PR, V24

[8] A Neurocomputational Model for Cocaine Addiction [J]. Dezfouli, Amir; Piray, Payam; Keramati, Mohammad Mahdi; Ekhtiari, Hamed; Lucas, Caro; Mokri, Azarakhsh. NEURAL COMPUTATION, 2009, 21 (10): 2869-2893

[9] Beyond pain: modeling decision-making deficits in chronic pain [J]. Emanuel Hess, Leonardo; Haimovici, Ariel; Angel Munoz, Miguel; Montoya, Pedro. FRONTIERS IN BEHAVIORAL NEUROSCIENCE, 2014, 8

[10] By carrot or by stick: Cognitive reinforcement learning in Parkinsonism [J]. Frank, MJ; Seeberger, LC; O'Reilly, RC. SCIENCE, 2004, 306 (5703): 1940-1943
