Knowledge Transfer in Deep Reinforcement Learning via an RL-Specific GAN-Based Correspondence Function

Cited by: 0
Authors
Ruman, Marko [1]
Guy, Tatiana V. [1,2]
Affiliations
[1] Czech Acad Sci, Inst Informat Theory & Automat, Dept Adapt Syst, Prague 18200, Czech Republic
[2] Czech Univ Life Sci, Fac Econ & Management, Dept Informat Engn, Prague 16500, Czech Republic
Keywords
Training; Decision making; Games; Network architecture; Generative adversarial networks; Deep reinforcement learning; Knowledge transfer; Standards; Deep learning; Markov decision process; reinforcement learning; transfer learning; knowledge transfer
DOI
10.1109/ACCESS.2024.3497589
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Deep reinforcement learning has demonstrated superhuman performance in complex decision-making tasks, but it struggles with generalization and knowledge reuse, key aspects of true intelligence. This article introduces a novel approach that modifies Cycle Generative Adversarial Networks specifically for reinforcement learning, enabling effective one-to-one knowledge transfer between two tasks. Our method enhances the loss function with two new components: a model loss, which captures the dynamic relationships between the source and target tasks, and a Q-loss, which identifies states that significantly influence the target decision policy. Tested on the 2-D Atari game Pong, our method achieved 100% knowledge transfer on identical tasks and, depending on the network architecture, either 100% knowledge transfer or a 30% reduction in training time on a rotated task. In contrast, using standard Generative Adversarial Networks or Cycle Generative Adversarial Networks led to worse performance than training from scratch in the majority of cases. These results demonstrate that the proposed method enhances knowledge generalization in deep reinforcement learning.
Pages: 177204-177218
Number of pages: 15
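
The abstract above names two RL-specific additions to the CycleGAN objective, a model loss and a Q-loss, without giving their formulas in this record. The Python/PyTorch sketch below shows one plausible way such a combined generator objective could be organized. The helper mlp, the class CorrespondenceLoss, the lam_* coefficients, and the exact forms of the model loss and Q-loss terms are illustrative assumptions, not the authors' definitions.

```python
# Hypothetical sketch of a CycleGAN-style correspondence between source and
# target states, augmented with a "model loss" and a "Q-loss" as named in the
# abstract. All shapes, coefficients, and the precise loss forms are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as nnf


def mlp(in_dim, out_dim, hidden=64):
    """Small fully connected network used as a stand-in generator/discriminator."""
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))


class CorrespondenceLoss(nn.Module):
    """CycleGAN losses plus RL-specific model- and Q-loss terms (illustrative)."""

    def __init__(self, s_dim, t_dim, lam_cycle=10.0, lam_model=1.0, lam_q=1.0):
        super().__init__()
        self.G = mlp(s_dim, t_dim)   # generator: source state -> target state
        self.F = mlp(t_dim, s_dim)   # generator: target state -> source state
        self.D_t = mlp(t_dim, 1)     # discriminator on target states
        self.D_s = mlp(s_dim, 1)     # discriminator on source states
        self.lam_cycle, self.lam_model, self.lam_q = lam_cycle, lam_model, lam_q

    def generator_loss(self, s, s_next, a, t, target_model, target_q):
        """s, s_next, a: source-task transitions; t: target-task states.
        target_model(state, action) -> next target state (assumed available).
        target_q(state) -> per-action Q-values of the target policy (assumed)."""
        fake_t, fake_s = self.G(s), self.F(t)

        # Standard least-squares adversarial terms: fool both discriminators.
        adv = nnf.mse_loss(self.D_t(fake_t), torch.ones_like(self.D_t(fake_t))) \
            + nnf.mse_loss(self.D_s(fake_s), torch.ones_like(self.D_s(fake_s)))

        # Cycle consistency: F(G(s)) should recover s, and G(F(t)) recover t.
        cycle = nnf.l1_loss(self.F(fake_t), s) + nnf.l1_loss(self.G(fake_s), t)

        # Model loss (assumed form): the mapped next state should agree with
        # the target-task dynamics applied to the mapped current state.
        model = nnf.l1_loss(self.G(s_next), target_model(fake_t, a))

        # Q-loss (assumed form): weight reconstruction by how strongly a mapped
        # state influences the target policy, measured by its Q-value spread.
        q_vals = target_q(fake_t)                              # (batch, n_actions)
        weight = (q_vals.max(1).values - q_vals.min(1).values).detach()
        q = (weight * (self.F(fake_t) - s).abs().mean(dim=1)).mean()

        return adv + self.lam_cycle * cycle + self.lam_model * model + self.lam_q * q
```

In a full training loop the discriminators would be updated with their own least-squares objectives, as in standard CycleGAN; the sketch only covers the generator-side objective, which is where the two RL-specific terms enter.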