Learning by reusing previous advice: a memory-based teacher-student framework

Cited by: 1

Authors:

Zhu, Changxi [1]

Cai, Yi [1]

Hu, Shuyue [2]

Leung, Ho-fung [3]

Chiu, Dickson K. W. [4]

Affiliations:

[1] South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China

[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China

[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Univ Hong Kong, Fac Educ, Hong Kong, Peoples R China

Source:

AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS | 2023, Vol. 37, No. 1

Funding:

National Natural Science Foundation of China;

Keywords:

Reinforcement learning; Multi-agent learning; Action advising; Teacher-student;

DOI:

10.1007/s10458-022-09595-1

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Reinforcement Learning (RL) has been widely used to solve sequential decision-making problems. However, it often suffers from slow learning speed in complex scenarios. Teacher-student frameworks address this issue by enabling agents to ask for and give advice so that a student agent can leverage the knowledge of a teacher agent to facilitate its learning. In this paper, we consider the effect of reusing previous advice, and propose a novel memory-based teacher-student framework such that student agents can memorize and reuse the previous advice from teacher agents. In particular, we propose two methods to decide whether previous advice should be reused: Q-Change per Step that reuses the advice if it leads to an increase in Q-values, and Decay Reusing Probability that reuses the advice with a decaying probability. The experiments on diverse RL tasks (Mario, Predator-Prey and Half Field Offense) confirm that our proposed framework significantly outperforms the existing frameworks in which previous advice is not reused.
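
The two reuse rules named in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the `AdviceMemory` class, the function names, the per-state bookkeeping, and the multiplicative decay factor are all hypothetical choices made for illustration, and the abstract does not specify how the Q-value change is measured or how the probability decays.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Hashable, Optional


@dataclass
class AdviceMemory:
    """Hypothetical per-state store of advice received from a teacher."""
    advice: Dict[Hashable, int] = field(default_factory=dict)
    q_change: Dict[Hashable, float] = field(default_factory=dict)
    reuse_prob: Dict[Hashable, float] = field(default_factory=dict)

    def store(self, state: Hashable, action: int, initial_prob: float = 1.0) -> None:
        # Remember the advised action and reset the bookkeeping for this state.
        self.advice[state] = action
        self.q_change[state] = 0.0
        self.reuse_prob[state] = initial_prob

    def record_q_change(self, state: Hashable, q_before: float, q_after: float) -> None:
        # Assumption: the change is the difference in Q(state, advised action)
        # across one update made after following the advice.
        self.q_change[state] = q_after - q_before


def reuse_by_q_change(memory: AdviceMemory, state: Hashable) -> Optional[int]:
    """Reuse stored advice only if following it last increased the Q-value."""
    if state in memory.advice and memory.q_change.get(state, 0.0) > 0.0:
        return memory.advice[state]
    return None


def reuse_by_decay(memory: AdviceMemory, state: Hashable,
                   rng: random.Random, decay: float = 0.9) -> Optional[int]:
    """Reuse stored advice with a probability that shrinks on every lookup."""
    if state not in memory.advice:
        return None
    prob = memory.reuse_prob[state]
    # Assumption: multiplicative decay applied each time the state is consulted.
    memory.reuse_prob[state] = prob * decay
    return memory.advice[state] if rng.random() < prob else None
```

In this sketch a student would call one of the two functions before acting; a `None` result means it falls back to its own policy or asks the teacher again. Keeping the bookkeeping per state is only one possible design, chosen here because it keeps the example short.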

Pages: 30

50 items in total

[41] Adversarial Teacher-Student Representation Learning for Domain Generalization
Yang, Fu-En
Cheng, Yuan-Chia
Shiau, Zu-Yun
Wang, Yu-Chiang Frank
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[42] Hybrid Learning with Teacher-student Knowledge Distillation for Recommenders
Zhang, Hangbin
Wong, Raymond K.
Chu, Victor W.
20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020), 2020, : 227 - 235
[43] Motivation and engagement in mathematics: a qualitative framework for teacher-student interactions
Durksen T.L.
Way J.
Bobis J.
Anderson J.
Skilling K.
Martin A.J.
Mathematics Education Research Journal, 2017, 29 (2) : 163 - 181
[44] A Multitask Teacher-Student Framework for Perceptual Audio Quality Assessment
Wu, Chih-Wei
Williams, Phillip A.
Wolcott, William
29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 396 - 400
[45] Interpretable Heterogeneous Teacher-Student Learning Framework for Hybrid-Supervised Pulmonary Nodule Detection
Huang, Guangyu
Yan, Yan
Xue, Jing-Hao
Zhu, Wentao
Luo, Xiongbiao
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 12100 - 12111
[46] A THEORY FOR MEMORY-BASED LEARNING
LIN, JH
VITTER, JS
MACHINE LEARNING, 1994, 17 (2-3) : 143 - 167
[47] Enhancing teacher-student interactions and student online engagement in an online learning environment
Ong, Sharmaine Gek Teng
Quek, Gwendoline Choon Lang
LEARNING ENVIRONMENTS RESEARCH, 2023, 26 (03) : 681 - 707
[48] Construction of the Teacher-Student Interaction Model in Online Learning Spaces
Xie, Youru
Huang, Yuling
Bai, Yucheng
Luo, Wenjing
Qiu, Yi
BLENDED LEARNING: RE-THINKING AND RE-DEFINING THE LEARNING PROCESS, ICBL 2021, 2021, 12830 : 53 - 65
[49] The influence of teacher-student relationships and feedback on students' engagement with learning
Plater, Mark
INTERNATIONAL JOURNAL OF CHILDRENS SPIRITUALITY, 2018, 23 (03) : 340 - 342
[50] CTS: Concurrent Teacher-Student Reinforcement Learning for Legged Locomotion
Wang, Hongxi
Luo, Haoxiang
Zhang, Wei
Chen, Hua
IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11) : 9191 - 9198