A Critical Period for Robust Curriculum-Based Deep Reinforcement Learning of Sequential Action in a Robot Arm

Cited by: 4
Authors
de Kleijn, Roy [1 ]
Sen, Deniz [2 ]
Kachergis, George [3 ]
Affiliations
[1] Leiden Univ, Leiden Inst Brain & Cognit, Leiden, Netherlands
[2] Leiden Univ, Math Inst, Leiden, Netherlands
[3] Stanford Univ, Language & Cognit Lab, Stanford, CA 94305 USA
Keywords
Curriculum learning; Movement optimization; Reinforcement learning; Robotic arm control; Sequential action; Generation
DOI
10.1111/tops.12595
Chinese Library Classification
B84 [Psychology]
Discipline Classification Code
04; 0402
Abstract
Many everyday activities are sequential in nature. That is, they can be seen as a sequence of subactions and sometimes subgoals. In the motor execution of sequential action, context effects are observed in which later subactions modulate the execution of earlier subactions (e.g., when reaching for an overturned mug, people optimize their grasp to achieve a comfortable end state). A trajectory (movement) adaptation of an often-used paradigm in the study of sequential action, the serial response time task, showed several context effects, of which centering behavior is of special interest. Centering behavior refers to the tendency (or strategy) of subjects to move their arm or mouse cursor to a position equidistant from all stimuli in the absence of predictive information, thereby reducing movement time to all possible targets. In the current study, we investigated sequential action in a virtual robotic agent trained using proximal policy optimization, a state-of-the-art deep reinforcement learning algorithm. The agent was trained to reach for appearing targets, similar to a serial response time task given to humans. We found that agents were more likely to develop centering behavior similar to human subjects after curricularized learning. In our curriculum, we first rewarded agents for reaching targets before introducing a penalty for energy expenditure. When the penalty was applied with no curriculum, many agents failed to learn the task due to a lack of action-space exploration, resulting in high variability of agents' performance. Our findings suggest that in virtual agents, similar to infants, early energetic exploration can promote robust later learning. This may have the same effect as infants' curiosity-based learning, by which they shape their own curriculum. However, introducing new goals cannot wait too long, as there may be critical periods in development after which agents (like humans) cannot flexibly learn to incorporate new objectives. These lessons are making their way into machine learning and offer exciting new avenues for studying both human and machine learning of sequential action.
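The record does not include source code. Purely as an illustrative sketch of the staged reward structure described in the abstract (first reward reaching, later add an energy-expenditure penalty), a curriculum of this kind might be expressed in Python roughly as below. The function name curriculum_reward and the parameters penalty_start_step and energy_weight are hypothetical and are not taken from the paper.

import numpy as np

def curriculum_reward(distance_to_target, joint_torques, step,
                      penalty_start_step=100_000, energy_weight=0.01):
    # Phase 1 (before the curriculum switch): reward only progress toward
    # the target, so the agent can explore the action space freely.
    reach_reward = -float(distance_to_target)
    if step < penalty_start_step:
        return reach_reward
    # Phase 2 (after the switch): additionally penalize energy expenditure,
    # approximated here as the squared magnitude of the joint torques.
    energy_penalty = energy_weight * float(np.sum(np.square(joint_torques)))
    return reach_reward - energy_penalty

Setting penalty_start_step to 0 would correspond to the no-curriculum condition in which, per the abstract, many agents failed to learn the task.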
Pages: 311-326
Number of pages: 16