Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1

Authors:

Zhu, Changxi [1]

Cai, Yi [1]

Hu, Shuyue [2]

Leung, Ho-fung [3]

Chiu, Dickson K. W. [4]

Affiliations:

[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China

[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China

[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China

Source:

AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS | 2023 / Vol. 37 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Reinforcement learning; Multi-agent learning; Action advising; Teacher-student;

DOI:

10.1007/s10458-022-09595-1

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
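The two reuse rules named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives only a one-sentence description of each rule, so the class `AdviceMemory`, its method names, the multiplicative `decay` factor, the choice to decay the probability on each reuse, and the choice to reuse untried advice once are all assumptions made for illustration.

```python
import random


class AdviceMemory:
    """Hypothetical student-side store of previously received advice."""

    def __init__(self, decay=0.9, rng=None):
        self.decay = decay            # assumed multiplicative decay factor
        self.rng = rng or random.Random()
        self.advice = {}              # state -> advised action
        self.reuse_prob = {}          # state -> current reuse probability
        self.last_q_change = {}       # state -> latest Q-value change after following advice

    def store(self, state, action):
        """Memorize fresh advice from a teacher and reset its bookkeeping."""
        self.advice[state] = action
        self.reuse_prob[state] = 1.0
        self.last_q_change.pop(state, None)

    def record_q_change(self, state, q_before, q_after):
        """Record how the Q-value moved after the advised action was followed."""
        self.last_q_change[state] = q_after - q_before

    def reuse_by_q_change(self, state):
        """Q-Change per Step (sketch): reuse advice whose last use increased the Q-value."""
        if state not in self.advice:
            return None
        change = self.last_q_change.get(state)
        # Assumption: advice with no recorded change yet is reused once to try it.
        if change is None or change > 0:
            return self.advice[state]
        return None

    def reuse_by_decay(self, state):
        """Decay Reusing Probability (sketch): reuse with a probability that shrinks on each reuse."""
        if state not in self.advice:
            return None
        p = self.reuse_prob[state]
        if self.rng.random() < p:
            self.reuse_prob[state] = p * self.decay  # assumed decay schedule
            return self.advice[state]
        return None
```

In this sketch a return value of `None` means the student does not reuse stored advice and falls back to its own policy or to asking a teacher again; how the paper handles that fallback is not stated in the abstract.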

Pages: 30
