Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1
Authors
Zhu, Changxi [1]
Cai, Yi [1]
Hu, Shuyue [2]
Leung, Ho-fung [3]
Chiu, Dickson K. W. [4]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Multi-agent learning; Action advising; Teacher-student;
DOI
10.1007/s10458-022-09595-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
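The abstract names the two reuse rules but does not give their exact update equations, so the following is only a minimal Python sketch under assumed forms. The `AdviceMemory` class, the function names, the comparison `q_after > q_before`, and the multiplicative `decay` factor are all hypothetical illustrations, not the authors' implementation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class AdviceMemory:
    """Hypothetical per-state store of advice received from a teacher."""
    advice: dict = field(default_factory=dict)      # state -> advised action
    reuse_prob: dict = field(default_factory=dict)  # state -> reuse probability

    def store(self, state, action, initial_prob=1.0):
        # Memorize the advised action and reset its reuse probability.
        self.advice[state] = action
        self.reuse_prob[state] = initial_prob


def reuse_by_q_change(q_before, q_after):
    """Assumed form of a Q-change rule: reuse only if the Q-value increased."""
    return q_after > q_before


def reuse_by_decay(memory, state, decay=0.9, rng=random):
    """Assumed form of a decaying rule: reuse with probability p, then shrink p."""
    if state not in memory.advice:
        return False  # nothing memorized for this state
    p = memory.reuse_prob[state]
    reuse = rng.random() < p
    memory.reuse_prob[state] = p * decay  # assumption: decay on every query
    return reuse


memory = AdviceMemory()
memory.store("s0", "jump")
if reuse_by_decay(memory, "s0"):
    action = memory.advice["s0"]  # follow the memorized advice
```

In this sketch the student consults its memory before asking the teacher again; whether the probability decays on every query or only on every reuse is a design choice the abstract leaves open.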
Pages: 30