Bandit Models of Human Behavior: Reward Processing in Mental Disorders

Cited by: 10
Authors
Bouneffouf, Djallel [1]
Rish, Irina [1]
Cecchi, Guillermo A. [1]
Affiliations
[1] IBM Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Source
ARTIFICIAL GENERAL INTELLIGENCE: 10TH INTERNATIONAL CONFERENCE, AGI 2017 | 2017 / Vol. 10414
DOI
10.1007/978-3-319-63703-7_22
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
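The idea in the abstract, a Thompson Sampling baseline with added reward-processing parameters, can be illustrated with a minimal sketch for Bernoulli rewards. The abstract does not give the paper's actual parametrization, so everything specific here is an assumption made for illustration: the function name `thompson_sampling`, the weights `w_pos` and `w_neg` on the current reward and penalty, the history-decay factors `lam_pos` and `lam_neg`, and the small positive floor that keeps the Beta parameters valid. With all four weights set to 1.0 the sketch reduces to standard Thompson Sampling.

```python
import random


def thompson_sampling(true_means, horizon, w_pos=1.0, w_neg=1.0,
                      lam_pos=1.0, lam_neg=1.0, seed=0):
    """Bernoulli Thompson Sampling with hypothetical bias weights.

    w_pos / w_neg scale how strongly a reward / penalty is counted;
    lam_pos / lam_neg discount the accumulated history of the pulled arm.
    These names are illustrative assumptions, not the paper's notation.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1.0] * k  # Beta(1, 1) uniform prior for every arm
    failures = [1.0] * k
    floor = 1e-3           # assumed safeguard: Beta parameters must stay > 0
    total = 0

    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)

        # Simulated environment: Bernoulli reward with the arm's true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        total += reward

        # Biased update: decay the history, then add the weighted outcome.
        successes[arm] = max(lam_pos * successes[arm] + w_pos * reward, floor)
        failures[arm] = max(lam_neg * failures[arm] + w_neg * (1 - reward), floor)

    return total, successes, failures


if __name__ == "__main__":
    # Baseline agent versus an agent that under-weights penalties.
    print(thompson_sampling([0.2, 0.8], horizon=500))
    print(thompson_sampling([0.2, 0.8], horizon=500, w_neg=0.5))
```

Varying the four weights changes how quickly the agent forgets past outcomes and how much it reacts to gains versus losses, which is one plausible way to read the "reward processing biases" the abstract refers to.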
Pages: 237-248
Page count: 12