Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1
Authors
Zhu, Changxi [1]
Cai, Yi [1]
Hu, Shuyue [2]
Leung, Ho-fung [3]
Chiu, Dickson K. W. [4]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Reinforcement learning; Multi-agent learning; Action advising; Teacher-student;
DOI
10.1007/s10458-022-09595-1
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
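The two reuse rules named in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the class `AdviceMemory`, the functions `q_learning_update` and `keep_advice_by_q_change`, the multiplicative decay schedule, and the reading of "an increase in Q-values" as a positive one-step Q-learning update are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class AdviceMemory:
    """Hypothetical per-state store of the last action advised by a teacher."""

    def __init__(self, decay=0.9, seed=0):
        self.advice = {}        # state -> advised action
        self.reuse_prob = {}    # state -> current probability of reusing it
        self.decay = decay      # assumed multiplicative decay factor
        self.rng = random.Random(seed)

    def store(self, state, action):
        # New advice starts with reuse probability 1.0 (an assumption).
        self.advice[state] = action
        self.reuse_prob[state] = 1.0

    def reuse_by_decay(self, state):
        """Decaying-probability rule: reuse with probability p, then shrink p."""
        if state not in self.advice:
            return None
        p = self.reuse_prob[state]
        self.reuse_prob[state] = p * self.decay
        return self.advice[state] if self.rng.random() < p else None


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update; returns the change in Q(s, a)."""
    old = q[(state, action)]
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)] - old


def keep_advice_by_q_change(q_change):
    """Q-change rule (assumed form): keep reusing advice only if Q increased."""
    return q_change > 0


if __name__ == "__main__":
    actions = ["left", "right"]
    q = defaultdict(float)
    memory = AdviceMemory(decay=0.5)
    memory.store("s0", "right")

    action = memory.reuse_by_decay("s0")
    delta = q_learning_update(q, "s0", "right", 1.0, "s1", actions)
    print(action, delta, keep_advice_by_q_change(delta))
```

In this sketch the student consults `AdviceMemory` before asking the teacher again, so stored advice can replace a fresh query whenever one of the two rules permits reuse.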
Pages: 30