A Critical Period for Robust Curriculum-Based Deep Reinforcement Learning of Sequential Action in a Robot Arm

Authors
de Kleijn, Roy [1]
Sen, Deniz [2]
Kachergis, George [3]
Affiliations
[1] Leiden Univ, Leiden Inst Brain & Cognit, Leiden, Netherlands
[2] Leiden Univ, Math Inst, Leiden, Netherlands
[3] Stanford Univ, Language & Cognit Lab, Stanford, CA 94305 USA
Keywords
Curriculum learning; Movement optimization; Reinforcement learning; Robotic arm control; Sequential action; GENERATION
DOI
10.1111/tops.12595
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Many everyday activities are sequential in nature. That is, they can be seen as a sequence of subactions and sometimes subgoals. In the motor execution of sequential action, context effects are observed in which later subactions modulate the execution of earlier subactions (e.g., reaching for an overturned mug, people will optimize their grasp to achieve a comfortable end state). A trajectory (movement) adaptation of an often-used paradigm in the study of sequential action, the serial response time task, showed several context effects of which centering behavior is of special interest. Centering behavior refers to the tendency (or strategy) of subjects to move their arm or mouse cursor to a position equidistant to all stimuli in the absence of predictive information, thereby reducing movement time to all possible targets. In the current study, we investigated sequential action in a virtual robotic agent trained using proximal policy optimization, a state-of-the-art deep reinforcement learning algorithm. The agent was trained to reach for appearing targets, similar to a serial response time task given to humans. We found that agents were more likely to develop centering behavior similar to human subjects after curricularized learning. In our curriculum, we first rewarded agents for reaching targets before introducing a penalty for energy expenditure. When the penalty was applied with no curriculum, many agents failed to learn the task due to a lack of action space exploration, resulting in high variability of agents' performance. Our findings suggest that in virtual agents, similar to infants, early energetic exploration can promote robust later learning. This may have the same effect as infants' curiosity-based learning by which they shape their own curriculum. However, introducing new goals cannot wait too long, as there may be critical periods in development after which agents (as humans) cannot flexibly learn to incorporate new objectives. 
These lessons are making their way into machine learning and offer exciting new avenues for studying both human and machine learning of sequential action.
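The two-phase curriculum described in the abstract (reward reaching first, add an energy penalty later) can be sketched as a reward function. This is a minimal illustration only: the function name `curriculum_reward`, the parameters `penalty_start_step` and `energy_weight`, and the squared-torque energy measure are all assumptions made here, since the abstract does not state the reward formulation, the penalty magnitude, or the timing of the switch.

```python
def curriculum_reward(reached_target, joint_torques, step,
                      penalty_start_step, energy_weight=0.01):
    """Hypothetical two-phase reward; not the paper's actual formulation.

    Phase 1 (step < penalty_start_step): reward target reaching only.
    Phase 2 (step >= penalty_start_step): also subtract an energy penalty.
    """
    # Sparse reward for reaching the target (assumed value of 1.0).
    reach_reward = 1.0 if reached_target else 0.0

    # Before the curriculum switch, energy use is free, which leaves
    # the agent room to explore its action space.
    if step < penalty_start_step:
        return reach_reward

    # After the switch, penalize energy expenditure. The sum of squared
    # torques is an assumed proxy for energy, chosen for illustration.
    energy_cost = energy_weight * sum(t * t for t in joint_torques)
    return reach_reward - energy_cost


# Usage: the same action is scored differently before and after the switch.
early = curriculum_reward(True, [1.0, 2.0], step=0, penalty_start_step=100)
late = curriculum_reward(True, [1.0, 2.0], step=100, penalty_start_step=100,
                         energy_weight=0.1)
print(early, late)
```

Setting `penalty_start_step` to 0 corresponds to the no-curriculum condition, in which the penalty applies from the start. A very large value postpones the new objective, which is the situation the abstract warns about when it mentions critical periods.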
Pages: 311-326
Page count: 16