Distributed Deep Deterministic Policy Gradient for Power Allocation Control in D2D-Based V2V Communications

Cited by: 40
Authors
Khoi Khac Nguyen [1 ]
Trung Q Duong [1 ]
Ngo Anh Vien [1 ]
Nhien-An Le-Khac [2 ]
Long D Nguyen [3 ]
Affiliations
[1] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT7 1NN, Antrim, North Ireland
[2] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
[3] Duy Tan Univ, Da Nang 550000, Vietnam
Keywords
Non-cooperative D2D communication; D2D-based V2V communications; power allocation; multi-agent deep reinforcement learning; deep deterministic policy gradient (DDPG); resource allocation; networks
DOI
10.1109/ACCESS.2019.2952411
CLC Number
TP [Automation technology, computer technology]
Subject Classification Code
0812
Abstract
Device-to-device (D2D) communication is an emerging technology in the evolution of 5G-enabled vehicle-to-vehicle (V2V) communications. It is a core technique for many next-generation platforms and applications, e.g. real-time high-quality video streaming, virtual reality gaming, and smart city operation. However, the rapid proliferation of user devices and sensors calls for more efficient resource allocation algorithms that enhance network performance while still guaranteeing quality-of-service. Deep reinforcement learning is currently emerging as a powerful tool that gives each node in the network a real-time self-organising capability. In this paper, we present two novel approaches based on the deep deterministic policy gradient algorithm, namely "distributed deep deterministic policy gradient" and "sharing deep deterministic policy gradient", for the multi-agent power allocation problem in D2D-based V2V communications. Numerical results show that our proposed models outperform other deep reinforcement learning approaches in terms of the network's energy efficiency and flexibility.
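The abstract names the deep deterministic policy gradient (DDPG) algorithm as the basis of both proposed schemes. As a rough illustration only, the PyTorch sketch below shows an actor-critic pair for per-link transmit power control; the observation dimension, hidden layer sizes, the `P_MAX` constant, and the distributed/sharing reading in the comments are assumptions for exposition, not the authors' exact design.

```python
# Minimal sketch of a DDPG actor-critic for continuous power allocation.
# Assumption: state = a V2V link's local channel observation,
#             action = its transmit power in (0, P_MAX).
import torch
import torch.nn as nn

P_MAX = 1.0  # assumed maximum transmit power (normalised)


class Actor(nn.Module):
    """Deterministic policy: maps a local observation to a power level."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # squash to (0, 1)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return P_MAX * self.net(obs)  # transmit power in (0, P_MAX)


class Critic(nn.Module):
    """Q-function: scores an (observation, power) pair."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, power: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, power], dim=-1))


# Example: one agent observing a 6-dimensional local channel state.
actor, critic = Actor(obs_dim=6), Critic(obs_dim=6)
obs = torch.randn(1, 6)
power = actor(obs)            # chosen transmit power
q_value = critic(obs, power)  # critic's estimate of that choice
```

Under the "distributed" reading, each V2V link would train its own actor-critic pair on local observations; under the "sharing" reading, the links would reuse one set of network parameters. Both readings are plausible interpretations of the scheme names, not details confirmed by the record.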
Pages: 164533-164543
Number of pages: 11