HMM for discovering decision-making dynamics using reinforcement learning experiments

Cited by: 0

Authors:

Guo, Xingche ^{[1]}

Zeng, Donglin ^{[2]}

Wang, Yuanjia ^{[1,3]}

Affiliations:

[1] Columbia Univ, Dept Biostat, 722 West 168th St, New York, NY 10032 USA

[2] Univ Michigan, Dept Biostat, 1415 Washington Hts, Ann Arbor, MI 48109 USA

[3] Columbia Univ, Dept Psychiat, 1051 Riverside Dr, New York, NY 10032 USA

Source:

BIOSTATISTICS | 2024

Funding:

U.S. National Institutes of Health;

Keywords:

behavioral phenotyping; brain-behavior association; mental health; reinforcement learning; reward tasks; state-switching; PSYCHIATRY; TASK;

DOI:

10.1093/biostatistics/kxae033

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Major depressive disorder (MDD), a leading cause of years of life lived with disability, presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks in the laboratory that involve making choices or responding to stimuli associated with different outcomes, such as gains or losses. Reinforcement learning (RL) models are fitted to extract parameters that measure various aspects of reward processing (e.g., reward sensitivity) and to characterize how patients make decisions in behavioral tasks. Recent findings suggest that a single RL model is inadequate for characterizing reward learning; instead, decision-making may switch between multiple strategies. An important scientific question is how the dynamics of strategies in decision-making affect the reward learning ability of individuals with MDD. Motivated by the probabilistic reward task within the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study, we propose a novel RL-HMM (hidden Markov model) framework for analyzing reward-based decision-making. Our model accommodates decision-making strategy switching between two distinct approaches under an HMM: subjects making decisions based on the RL model or opting for random choices. We account for the continuous RL state space and allow time-varying transition probabilities in the HMM. We introduce a computationally efficient expectation-maximization (EM) algorithm for parameter estimation and use a nonparametric bootstrap for inference. Extensive simulation studies validate the finite-sample performance of our method. We apply our approach to the EMBARC study and show that MDD patients are less engaged in RL than healthy controls, and that engagement is associated with brain activity in the negative affect circuitry during an emotional conflict task.
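The two-strategy idea in the abstract can be illustrated with a small sketch. It is not the authors' RL-HMM: it assumes a two-armed task, a simple delta-rule value update applied on every trial regardless of the hidden state, a softmax choice rule in the engaged state, a 50/50 choice in the random state, and constant (not time-varying) transition probabilities. The names `simulate`, `log_likelihood`, `alpha`, `beta`, and `stay` are hypothetical, introduced only for this illustration. Under these assumptions the action values are fully determined by the observed choices and rewards, so the likelihood follows from the standard HMM forward recursion.

```python
import numpy as np


def _p_choose_1(q, beta):
    """Softmax probability of choosing arm 1 given action values q."""
    return 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))


def simulate(n_trials, alpha, beta, stay, reward_prob, rng):
    """Simulate a two-state agent: state 0 follows RL, state 1 chooses at random.

    Assumption: values are updated by a delta rule on every trial, in both states.
    """
    q = np.zeros(2)
    state = 0
    choices = np.empty(n_trials, dtype=int)
    rewards = np.empty(n_trials)
    for t in range(n_trials):
        p1 = _p_choose_1(q, beta) if state == 0 else 0.5
        a = int(rng.random() < p1)
        r = float(rng.random() < reward_prob[a])
        q[a] += alpha * (r - q[a])           # delta-rule update
        choices[t], rewards[t] = a, r
        if rng.random() > stay[state]:       # leave the current state
            state = 1 - state
    return choices, rewards


def log_likelihood(choices, rewards, alpha, beta, stay, init=(0.5, 0.5)):
    """Forward-algorithm log-likelihood of the choices under the sketch model."""
    trans = np.array([[stay[0], 1.0 - stay[0]],
                      [1.0 - stay[1], stay[1]]])
    q = np.zeros(2)
    belief = np.asarray(init, dtype=float)   # P(state_t | earlier trials)
    total = 0.0
    for a, r in zip(choices, rewards):
        p1 = _p_choose_1(q, beta)
        p_rl = p1 if a == 1 else 1.0 - p1
        emit = np.array([p_rl, 0.5])         # choice likelihood in each state
        joint = belief * emit
        norm = joint.sum()                   # P(choice_t | earlier trials)
        total += np.log(norm)
        belief = (joint / norm) @ trans      # filter, then predict next state
        q[a] += alpha * (r - q[a])
    return total
```

For example, calling `simulate(200, 0.3, 3.0, (0.9, 0.8), (0.8, 0.2), np.random.default_rng(0))` produces a choice and reward sequence, and passing it to `log_likelihood` with candidate parameters gives an objective that a generic optimizer could maximize. The EM algorithm and bootstrap inference described in the abstract are not reproduced here.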


Pages: 16

