Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1

Authors:

Zhu, Changxi [1]

Cai, Yi [1]

Hu, Shuyue [2]

Leung, Ho-fung [3]

Chiu, Dickson K. W. [4]

Affiliations:

[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China

[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China

[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China

Source:

AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS | 2023 / Volume 37 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Reinforcement learning; Multi-agent learning; Action advising; Teacher-student;

DOI:

10.1007/s10458-022-09595-1

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
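
The two reuse rules named in the abstract can be illustrated with a minimal Python sketch. This is one possible reading of the abstract's one-line descriptions, not the authors' algorithm: the class `AdviceMemory`, its method names, the per-state multiplicative decay, and the "Q-value higher than at the last check" test are all assumptions made here for illustration.

```python
import random
from collections import defaultdict


class AdviceMemory:
    """Hypothetical student-side memory mapping a state to the advised action."""

    def __init__(self, initial_prob=1.0, decay=0.9, rng=None):
        self.advice = {}        # state -> action last advised by a teacher
        self.q_at_check = {}    # state -> Q-value of the advised action at last check
        # Per-state reuse probability (assumed to decay multiplicatively).
        self.reuse_prob = defaultdict(lambda: initial_prob)
        self.decay = decay
        self.rng = rng or random.Random(0)

    def store(self, state, action, q_value):
        """Memorize a piece of advice and the student's current Q-value for it."""
        self.advice[state] = action
        self.q_at_check[state] = q_value

    def reuse_by_decay(self, state):
        """Decaying-probability rule: reuse with probability p, then shrink p."""
        if state not in self.advice:
            return None
        p = self.reuse_prob[state]
        self.reuse_prob[state] = p * self.decay
        return self.advice[state] if self.rng.random() < p else None

    def reuse_by_q_change(self, state, current_q):
        """Q-change rule: reuse only if the advised action's Q-value has increased."""
        if state not in self.advice:
            return None
        improved = current_q > self.q_at_check[state]
        self.q_at_check[state] = current_q
        return self.advice[state] if improved else None


# Illustrative usage with made-up state and action labels.
memory = AdviceMemory(initial_prob=1.0, decay=0.5)
memory.store("s0", action=2, q_value=0.0)
first = memory.reuse_by_q_change("s0", current_q=0.5)   # Q rose, so advice is reused
second = memory.reuse_by_q_change("s0", current_q=0.5)  # no further rise, so no reuse
```

In this sketch a return value of `None` means the student falls back to its own policy (or asks the teacher again); how that fallback works is outside what the abstract specifies.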


Pages: 30
