TPN: Triple network algorithm for deep reinforcement learning

Cited by: 1
Authors
Han, Chen [1]
Wang, Xuanyin [1]
Affiliations
[1] Zhejiang Univ, Sch Mech Engn, Yuhangtang Rd 388, Hangzhou 310063, Zhejiang, Peoples R China
Keywords
TPN; Deep reinforcement learning; Target net method; GAME; GO;
DOI
10.1016/j.neucom.2024.127755
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The target net method has been the foundation of deep reinforcement learning since DeepMind first proposed it in 2015, and almost all currently popular reinforcement learning algorithms include a target net. However, while the slowly updated target network improves the stability of an algorithm, it also reduces its performance. In this paper, the authors design a novel triple-network algorithm (TPN). TPN combines the temporal-difference (TD) algorithm and the policy gradient (PG) theorem, using three networks to estimate the state value (v), the action value (q), and the policy (π). These networks have no primary or secondary distinction; they are trained synchronously and influence each other. The authors found that this TPN architecture greatly improves the convergence and stability of the algorithm without increasing the amount of computation. Although TPN is only a basic framework at present, its calculation process is simple and easy to implement. Experiments show that the convergence speed and stability of TPN in discrete cases are better than those of PPO.
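For readers unfamiliar with the target net method mentioned in the abstract, the following is a minimal Python sketch of the general idea only. It is not the TPN algorithm, for which this record gives no implementation details. The function names, the flat parameter lists, and the values of gamma and tau are all illustrative assumptions.

```python
# Illustrative sketch of the generic target-net method (NOT the TPN algorithm).
# All names and hyperparameters here are hypothetical.

def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step TD target built from the *target* network's action values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def hard_update(online_params, target_params):
    """Periodic sync: copy the online parameters into the target network."""
    target_params[:] = online_params


def soft_update(online_params, target_params, tau=0.01):
    """Polyak averaging: the target slowly tracks the online parameters."""
    for i, w in enumerate(online_params):
        target_params[i] = (1.0 - tau) * target_params[i] + tau * w


online = [1.0, 2.0]   # stand-in for online network weights
target = [0.0, 0.0]   # stand-in for target network weights
soft_update(online, target, tau=0.5)
```

A small tau (or a long interval between hard updates) keeps the TD target stable but makes it lag behind the online network, which is the trade-off between stability and performance that the abstract describes.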
Pages: 10