Distributed Deep Deterministic Policy Gradient for Power Allocation Control in D2D-Based V2V Communications

Cited by: 40
Authors
Khoi Khac Nguyen [1 ]
Trung Q Duong [1 ]
Ngo Anh Vien [1 ]
Nhien-An Le-Khac [2 ]
Long D Nguyen [3 ]
Affiliations
[1] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT7 1NN, Antrim, North Ireland
[2] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
[3] Duy Tan Univ, Da Nang 550000, Vietnam
Keywords
Non-cooperative D2D communication; D2D-based V2V communications; power allocation; multi-agent deep reinforcement learning; deep deterministic policy gradient (DDPG); resource allocation; networks
DOI
10.1109/ACCESS.2019.2952411
CLC Number
TP [Automation technology, computer technology]
Subject Classification Code
0812
Abstract
Device-to-device (D2D) communication is an emerging technology in the evolution of 5G-enabled vehicle-to-vehicle (V2V) communications. It is a core technique for many next-generation platforms and applications, e.g. real-time high-quality video streaming, virtual reality gaming, and smart city operation. However, the rapid proliferation of user devices and sensors calls for more efficient resource allocation algorithms that enhance network performance while still guaranteeing quality-of-service. Deep reinforcement learning is currently emerging as a powerful tool that gives each node in the network a real-time self-organising capability. In this paper, we present two novel approaches based on the deep deterministic policy gradient algorithm, namely "distributed deep deterministic policy gradient" and "sharing deep deterministic policy gradient", for the multi-agent power allocation problem in D2D-based V2V communications. Numerical results show that our proposed models outperform other deep reinforcement learning approaches in terms of the network's energy efficiency and flexibility.
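The abstract names the deep deterministic policy gradient (DDPG) algorithm as the basis of both proposed schemes. As a rough illustration only, the PyTorch sketch below shows an actor-critic pair for per-link transmit power control; the observation dimension, hidden layer sizes, the `P_MAX` constant, and the distributed/sharing reading in the comments are assumptions for exposition, not the authors' exact design.

```python
# Minimal sketch of a DDPG actor-critic for continuous power allocation.
# Assumption: state = a V2V link's local channel observation,
#             action = its transmit power in (0, P_MAX).
import torch
import torch.nn as nn

P_MAX = 1.0  # assumed maximum transmit power (normalised)


class Actor(nn.Module):
    """Deterministic policy: maps a local observation to a power level."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # squash to (0, 1)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return P_MAX * self.net(obs)  # transmit power in (0, P_MAX)


class Critic(nn.Module):
    """Q-function: scores an (observation, power) pair."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, power: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, power], dim=-1))


# Example: one agent observing a 6-dimensional local channel state.
actor, critic = Actor(obs_dim=6), Critic(obs_dim=6)
obs = torch.randn(1, 6)
power = actor(obs)            # chosen transmit power
q_value = critic(obs, power)  # critic's estimate of that choice
```

Under the "distributed" reading, each V2V link would train its own actor-critic pair on local observations; under the "sharing" reading, the links would reuse one set of network parameters. Both readings are plausible interpretations of the scheme names, not details confirmed by the record.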
Pages: 164533-164543
Number of pages: 11