Knowledge Transfer in Deep Reinforcement Learning via an RL-Specific GAN-Based Correspondence Function

Cited by: 0
Authors
Ruman, Marko [1]
Guy, Tatiana V. [1,2]
Affiliations
[1] Czech Acad Sci, Inst Informat Theory & Automat, Dept Adapt Syst, Prague 18200, Czech Republic
[2] Czech Univ Life Sci, Fac Econ & Management, Dept Informat Engn, Prague 16500, Czech Republic
Keywords
Training; Decision making; Games; Network architecture; Generative adversarial networks; Deep reinforcement learning; Knowledge transfer; Standards; Deep learning; Markov decision process; reinforcement learning; transfer learning
DOI
10.1109/ACCESS.2024.3497589
CLC Classification
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
Deep reinforcement learning has demonstrated superhuman performance in complex decision-making tasks, but it struggles with generalization and knowledge reuse, key aspects of true intelligence. This article introduces a novel approach that modifies Cycle Generative Adversarial Networks specifically for reinforcement learning, enabling effective one-to-one knowledge transfer between two tasks. Our method enhances the loss function with two new components: a model loss, which captures the dynamic relationship between the source and target tasks, and a Q-loss, which identifies states that significantly influence the target decision policy. Tested on the 2-D Atari game Pong, our method achieved 100% knowledge transfer between identical tasks and, depending on the network architecture, either 100% knowledge transfer or a 30% reduction in training time for a rotated task. In contrast, using standard Generative Adversarial Networks or Cycle Generative Adversarial Networks led to worse performance than training from scratch in the majority of cases. The results demonstrate that the proposed method enhances knowledge generalization in deep reinforcement learning.
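The abstract names the two added loss components but not their exact form. The following minimal PyTorch sketch illustrates one plausible way such an augmented CycleGAN objective could be assembled; the network names (G_st, G_ts, D_s, D_t, model_t, Q_t), the loss weights, and the specific forms chosen here for the model loss and Q-loss are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): a generator-side CycleGAN
# objective for state correspondence between two RL tasks, augmented with
# the two components named in the abstract. All networks and the exact
# loss forms are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def rl_cyclegan_loss(G_st, G_ts,            # generators: source->target, target->source
                     D_s, D_t,              # discriminators for each state domain
                     model_t,               # assumed target-task dynamics model
                     Q_t,                   # assumed target-task Q-network
                     s_src, a_src, s_src_next,  # a source-task transition batch
                     s_tgt,                 # a target-task state batch
                     lam_cyc=10.0, lam_model=1.0, lam_q=1.0):
    fake_t = G_st(s_src)                    # source states mapped to the target domain
    fake_s = G_ts(s_tgt)                    # target states mapped to the source domain

    # Standard least-squares adversarial terms (generator update: fakes
    # should be scored as real by the discriminators).
    adv = F.mse_loss(D_t(fake_t), torch.ones_like(D_t(fake_t))) \
        + F.mse_loss(D_s(fake_s), torch.ones_like(D_s(fake_s)))

    # Standard cycle-consistency: mapping there and back recovers the input.
    cyc = F.l1_loss(G_ts(fake_t), s_src) + F.l1_loss(G_st(fake_s), s_tgt)

    # Model loss (assumed form): the mapped successor of a source transition
    # should agree with the target dynamics model's prediction, so the
    # correspondence respects task dynamics, not just appearance.
    model_loss = F.l1_loss(model_t(fake_t, a_src), G_st(s_src_next))

    # Q-loss (assumed form): weight reconstruction error by the spread of
    # target Q-values, so states that strongly influence the target policy
    # dominate the objective.
    q = Q_t(fake_t)                                        # (batch, n_actions)
    weight = (q.max(dim=1).values - q.min(dim=1).values).detach()
    per_state_err = (G_ts(fake_t) - s_src).abs().flatten(1).mean(dim=1)
    q_loss = (weight * per_state_err).mean()

    return adv + lam_cyc * cyc + lam_model * model_loss + lam_q * q_loss
```

As in ordinary CycleGAN training, this generator objective would alternate with adversarial updates of the discriminators on real versus mapped states.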
Pages: 177204-177218
Page count: 15
Related Papers
50 in total
[31] Wang, Zhaodong; Qin, Zhiwei; Tang, Xiaocheng; Ye, Jieping; Zhu, Hongtu. Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching. 2018 IEEE International Conference on Data Mining (ICDM), 2018: 617-626.
[32] Chen, Yi-Ren; Rezapour, Amir; Tzeng, Wen-Guey; Tsai, Shi-Chun. RL-Routing: An SDN Routing Algorithm Based on Deep Reinforcement Learning. IEEE Transactions on Network Science and Engineering, 2020, 7(4): 3185-3199.
[33] Liu, J.-W.; Gao, F.; Luo, X.-L. Survey of Deep Reinforcement Learning Based on Value Function and Policy Gradient. Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42(6): 1406-1438.
[34] Miao, Rui; Zhang, Xia; Yan, Hongfei; Chen, Chong. A Dynamic Financial Knowledge Graph Based on Reinforcement Learning and Transfer Learning. 2019 IEEE International Conference on Big Data (Big Data), 2019: 5370-5378.
[35] Ding, Derui; Ding, Zifan; Wei, Guoliang; Han, Fei. An improved reinforcement learning algorithm based on knowledge transfer and applications in autonomous vehicles. Neurocomputing, 2019, 361: 243-255.
[36] Yang, Kunlin; Liu, Yang. Global Ionospheric Total Electron Content Completion with a GAN-Based Deep Learning Framework. Remote Sensing, 2022, 14(23).
[37] Chen, Xiaoliang; Proietti, Roberto; Liu, Che-Yu; Yoo, S. J. Ben. A Multi-Task-Learning-Based Transfer Deep Reinforcement Learning Design for Autonomic Optical Networks. IEEE Journal on Selected Areas in Communications, 2021, 39(9): 2878-2889.
[38] Wu, Di; Xu, Yi Tian; Jenkin, Michael; Jang, Seowoo; Hossain, Ekram; Liu, Xue; Dudek, Gregory. Energy Saving in Cellular Wireless Networks via Transfer Deep Reinforcement Learning. IEEE Conference on Global Communications (GLOBECOM), 2023: 7019-7024.
[39] Ji, Shulei; Yang, Xinyu; Luo, Jing; Li, Juan. RL-Chord: CLSTM-Based Melody Harmonization Using Deep Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(8): 11128-11141.
[40] Liu, Xiongqing; Jin, Yan. Reinforcement learning-based collision avoidance: impact of reward function and knowledge transfer. AI EDAM - Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 2020, 34(2): 207-222.