A Reinforcement Learning Approach to Understanding Procrastination: Does Inaccurate Value Approximation Cause Irrational Postponing of a Task?

Times Cited: 1
Authors
Feng, Zheyu [1 ]
Nagase, Asako Mitsuto [1 ,2 ,3 ,4 ]
Morita, Kenji [1 ,5 ]
Affiliations
[1] Univ Tokyo, Grad Sch Educ, Phys & Hlth Educ, Tokyo, Japan
[2] Tottori Univ, Fac Med, Dept Brain & Neurosci, Div Neurol, Yonago, Tottori, Japan
[3] Japan Soc Promot Sci, Res Fellowship Young Scientists, Tokyo, Japan
[4] Shimane Univ, Fac Med, Dept Neurol, Izumo, Japan
[5] Univ Tokyo, Int Res Ctr Neurointelligence WPI IRCN, Tokyo, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
procrastination; value-based decision making; reinforcement learning; temporal difference learning; state representation; successor representation; dimension reduction; SUCCESSOR REPRESENTATION; NEURAL MECHANISMS; DOPAMINE SIGNALS; BASAL GANGLIA; AVOIDANCE; FRAMEWORK; SELECTION; REWARDS; SYSTEMS; MODELS;
DOI
10.3389/fnins.2021.660595
Chinese Library Classification
Q189 [Neuroscience];
Discipline Classification Code
071006;
Abstract
Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been extensively studied in psychology, from its contributing factors to theoretical models. From the perspective of value-based decision making and reinforcement learning (RL), procrastination has been suggested to be caused by non-optimal choices resulting from cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely, inaccurate valuation resulting from inadequate state representation, would cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of the SR. We modeled a series of behaviors of a "student" doing assignments during the school term, when putting off the assignments (i.e., procrastination) is not allowed, and during the vacation, when whether or not to procrastinate can be freely chosen. We assumed that the "student" had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. Through temporal-difference (TD) learning, the "student" learned an approximated value of each state, computed as a linear function of the state features in the rigid reduced SR. During the vacation, the "student" decided at each time step whether to procrastinate, based on these approximated values. Simulation results showed that the reduced SR-based RL model generated procrastination behavior, which worsened across episodes. According to the values approximated by the "student," procrastinating was the better choice, whereas according to the true values, not procrastinating was mostly better. Thus, the current model generated procrastination behavior caused by inaccurate value approximation, which resulted from adopting the reduced SR as the state representation. These findings indicate that the reduced SR, or more generally, dimension reduction in state representation, is a potential form of cognitive limitation that leads to procrastination.
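The abstract describes a model with three ingredients: a fixed, dimension-reduced SR acquired under a no-procrastination policy; a linear value approximation over those reduced SR features trained by TD learning; and a per-step choice during "vacation" between working and procrastinating, made from the approximated values. Below is a minimal sketch of that kind of model, not the authors' implementation: the chain of assignment steps, the truncated-SVD reduction, all parameter values, and helper names such as run_episode are illustrative assumptions.

```python
# Sketch of a reduced-SR, linear-TD model of a "student" doing an assignment.
# All structural choices and numbers here are illustrative assumptions, not the
# published model's settings.
import numpy as np

N_STATES = 10          # steps to finish one assignment; the last step yields reward
GAMMA = 0.9            # discount factor
ALPHA = 0.1            # TD learning rate
COST_WORK = 0.2        # effort cost of working at each step
REWARD_DONE = 1.0      # reward for completing the assignment
K = 3                  # reduced SR dimension (K < N_STATES)

# --- SR under the policy that never procrastinates (deterministic chain) ---
# M[s, s'] = expected discounted future occupancy of s' when starting from s.
P = np.zeros((N_STATES, N_STATES))
for s in range(N_STATES - 1):
    P[s, s + 1] = 1.0                       # always move to the next step
M = np.linalg.inv(np.eye(N_STATES) - GAMMA * P)

# --- Dimension reduction of the SR (truncated SVD, one simple choice) ---
U, S, _ = np.linalg.svd(M)
phi = U[:, :K] * S[:K]                      # rigid reduced SR features, one row per state

# --- Linear value approximation v(s) ~= phi[s] @ w, learned by TD(0) ---
w = np.zeros(K)

def reward(s_next):
    """Effort cost at every working step, plus reward when the assignment is done."""
    return -COST_WORK + (REWARD_DONE if s_next == N_STATES - 1 else 0.0)

def run_episode(allow_procrastination):
    """One pass through the assignment; returns how many steps were procrastinated."""
    global w
    s, procrastinated = 0, 0
    while s < N_STATES - 1:
        v_work = reward(s + 1) + GAMMA * phi[s + 1] @ w   # work: pay cost, advance
        v_wait = 0.0 + GAMMA * phi[s] @ w                 # procrastinate: stay put, no cost now
        if allow_procrastination and v_wait > v_work:
            s_next, r = s, 0.0
            procrastinated += 1
        else:
            s_next, r = s + 1, reward(s + 1)
        # TD(0) update of the linear weights over the fixed reduced SR features
        delta = r + GAMMA * phi[s_next] @ w - phi[s] @ w
        w += ALPHA * delta * phi[s]
        s = s_next
        if procrastinated > 200:                          # guard against endless waiting
            break
    return procrastinated

# "School term": procrastination is not allowed; values are learned under that policy.
for _ in range(50):
    run_episode(allow_procrastination=False)

# "Vacation": the agent may now choose to procrastinate at every step.
for ep in range(5):
    n = run_episode(allow_procrastination=True)
    print(f"vacation episode {ep}: procrastinated on {n} steps")
```

The key design point mirrored from the abstract is that the reduced SR features stay fixed (they were formed under the no-procrastination policy), so any mismatch between the approximated and true values during the vacation stems from that rigid, low-dimensional representation rather than from the TD update itself.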
Pages: 18