Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1
Authors
Zhu, Changxi [1]
Cai, Yi [1]
Hu, Shuyue [2]
Leung, Ho-fung [3]
Chiu, Dickson K. W. [4]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Reinforcement learning; Multi-agent learning; Action advising; Teacher-student
DOI
10.1007/s10458-022-09595-1
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
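The two reuse rules named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the Q-value change is measured or what decay schedule is used, so everything here is an assumption made for illustration: the names `should_reuse_q_change` and `DecayingReuse`, the strict-increase test, and the multiplicative decay factor are hypothetical and are not taken from the paper.

```python
import random


def should_reuse_q_change(q_before: float, q_after: float) -> bool:
    """Q-Change per Step (sketch): reuse stored advice only if following it
    was accompanied by an increase in the student's Q-value.
    Assumption: a strict increase between two successive estimates."""
    return q_after > q_before


class DecayingReuse:
    """Decay Reusing Probability (sketch): reuse stored advice with a
    probability that shrinks after every reuse decision.
    Assumption: multiplicative decay; the paper's schedule may differ."""

    def __init__(self, initial_prob: float = 1.0, decay: float = 0.9, rng=None):
        self.prob = initial_prob
        self.decay = decay
        self.rng = rng if rng is not None else random.Random()

    def should_reuse(self) -> bool:
        # Draw against the current probability, then decay it for next time.
        reuse = self.rng.random() < self.prob
        self.prob *= self.decay
        return reuse


# Hypothetical memory of previous advice: state -> action advised by a teacher.
advice_memory = {"s0": "jump"}
rule = DecayingReuse(initial_prob=1.0, decay=0.5, rng=random.Random(0))

state = "s0"
if state in advice_memory and rule.should_reuse():
    action = advice_memory[state]   # reuse the remembered advice
else:
    action = "explore"              # placeholder for the student's own policy
```

In this sketch the student consults its memory before acting and falls back to its own policy whenever the chosen rule declines to reuse the stored advice.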
Pages: 30