HMM for discovering decision-making dynamics using reinforcement learning experiments

Citations: 0
Authors
Guo, Xingche [1]
Zeng, Donglin [2]
Wang, Yuanjia [1, 3]
Affiliations
[1] Columbia Univ, Dept Biostat, 722 West 168th St, New York, NY 10032 USA
[2] Univ Michigan, Dept Biostat, 1415 Washington Hts, Ann Arbor, MI 48109 USA
[3] Columbia Univ, Dept Psychiat, 1051 Riverside Dr, New York, NY 10032 USA
Funding
National Institutes of Health (NIH)
Keywords
behavioral phenotyping; brain-behavior association; mental health; reinforcement learning; reward tasks; state-switching; PSYCHIATRY; TASK;
DOI
10.1093/biostatistics/kxae033
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Major depressive disorder (MDD), a leading cause of years of life lived with disability, presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks in the laboratory that involve making choices or responding to stimuli associated with different outcomes, such as gains or losses. Reinforcement learning (RL) models are fitted to extract parameters that measure various aspects of reward processing (e.g. reward sensitivity) and so characterize how patients make decisions in behavioral tasks. Recent findings suggest that a single RL model is inadequate for characterizing reward learning; instead, decision-making may switch between multiple strategies. An important scientific question is how the dynamics of strategies in decision-making affect the reward learning ability of individuals with MDD. Motivated by the probabilistic reward task within the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study, we propose a novel RL-HMM (hidden Markov model) framework for analyzing reward-based decision-making. Our model accommodates decision-making strategy switching between two distinct approaches under an HMM: subjects making decisions based on the RL model or opting for random choices. We account for continuous RL state space and allow time-varying transition probabilities in the HMM. We introduce a computationally efficient expectation-maximization (EM) algorithm for parameter estimation and use a nonparametric bootstrap for inference. Extensive simulation studies validate the finite-sample performance of our method. We apply our approach to the EMBARC study to show that MDD patients are less engaged in RL than healthy controls, and that engagement is associated with brain activity in the negative affect circuitry during an emotional conflict task.
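The two-strategy idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-armed task, a simple delta-rule Q-learning update with a softmax choice rule, constant (not time-varying) transition probabilities, and value updates that occur in both hidden states. All function and parameter names (`simulate`, `forward_loglik`, `alpha`, `beta`, `stay`, `reward_prob`) are hypothetical. The sketch simulates choices from a hidden process that switches between an "engaged" (RL) state and a "random" state, then runs the HMM forward recursion to get the log-likelihood and the filtered probability of being engaged on each trial.

```python
import math
import random


def softmax2(q, beta):
    """Choice probabilities for two actions under a softmax over Q-values."""
    m = max(q)
    e = [math.exp(beta * (v - m)) for v in q]
    s = sum(e)
    return [v / s for v in e]


def simulate(n_trials, alpha, beta, stay, reward_prob, seed=0):
    """Simulate choices from a two-state process (0 = engaged/RL, 1 = random).

    Assumption: Q-values are updated by a delta rule in both hidden states.
    stay[k] is the probability of remaining in hidden state k.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]
    state = 0
    choices, rewards, states = [], [], []
    for _ in range(n_trials):
        p0 = softmax2(q, beta)[0] if state == 0 else 0.5
        a = 0 if rng.random() < p0 else 1
        r = 1.0 if rng.random() < reward_prob[a] else 0.0
        q[a] += alpha * (r - q[a])  # delta-rule value update
        choices.append(a)
        rewards.append(r)
        states.append(state)
        if rng.random() > stay[state]:  # switch strategy
            state = 1 - state
    return choices, rewards, states


def forward_loglik(choices, rewards, alpha, beta, stay, init=(0.5, 0.5)):
    """Normalized forward recursion for the two-state model.

    Returns the log-likelihood of the choices and, per trial, the filtered
    probability of the engaged state given data up to that trial.
    """
    q = [0.0, 0.0]
    pred = list(init)  # P(state_t | choices before t)
    loglik = 0.0
    filtered = []
    for a, r in zip(choices, rewards):
        emit = [softmax2(q, beta)[a], 0.5]  # engaged vs. random emission
        joint = [pred[k] * emit[k] for k in range(2)]
        norm = joint[0] + joint[1]
        loglik += math.log(norm)
        post = [j / norm for j in joint]
        filtered.append(post[0])
        # One-step prediction with a constant transition matrix.
        pred = [
            post[0] * stay[0] + post[1] * (1.0 - stay[1]),
            post[0] * (1.0 - stay[0]) + post[1] * stay[1],
        ]
        q[a] += alpha * (r - q[a])  # same update as in the simulator
    return loglik, filtered


choices, rewards, states = simulate(
    200, alpha=0.3, beta=3.0, stay=(0.95, 0.90), reward_prob=(0.8, 0.2)
)
ll, engaged_prob = forward_loglik(
    choices, rewards, alpha=0.3, beta=3.0, stay=(0.95, 0.90)
)
```

Because the Q-values here depend only on the observed choices and rewards, the emission probabilities are known at each trial and the forward pass reduces to a standard two-state recursion. An EM fit would alternate this kind of pass (plus a backward pass) with parameter updates; that step is omitted here.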
Pages: 16