Deep Reinforcement Learning for Joint Channel Selection and Power Control in D2D Networks

Cited by: 90
Authors
Tan, Junjie [1 ]
Liang, Ying-Chang [1 ]
Zhang, Lin [2 ]
Feng, Gang [2 ]
Affiliations
[1] Univ Elect Sci & Technol China UESTC, Ctr Intelligent Networking & Commun CINC, Chengdu 611731, Peoples R China
[2] Univ Elect Sci & Technol China UESTC, Natl Key Lab Commun, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Device-to-device communication; Power control; Wireless communication; Correlation; Reinforcement learning; Scalability; Optimization; Device-to-device (D2D); channel selection; power control; deep reinforcement learning (DRL); fractional programming (FP); DEVICE-TO-DEVICE COMMUNICATIONS; ALLOCATION ALGORITHM; RESOURCE-ALLOCATION; SPECTRUM; SYSTEMS; ACCESS; MODE; LTE;
DOI
10.1109/TWC.2020.3032991
CLC (Chinese Library Classification)
TM [electrical engineering]; TN [electronic and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
Device-to-device (D2D) technology, which allows direct communications between proximal devices, is widely acknowledged as a promising candidate to alleviate the mobile traffic explosion problem. In this paper, we consider an overlay D2D network, in which multiple D2D pairs coexist on several orthogonal spectrum bands, i.e., channels. Due to spectrum scarcity, the number of D2D pairs typically exceeds the number of available channels, and thus multiple D2D pairs may use a single channel simultaneously. This may lead to severe co-channel interference and degrade network performance. To deal with this issue, we formulate a joint channel selection and power control optimization problem, with the aim of maximizing the weighted-sum-rate (WSR) of the D2D network. Unfortunately, this problem is non-convex and NP-hard. To solve this problem, we first adopt the state-of-the-art fractional programming (FP) technique and develop an FP-based algorithm to obtain a near-optimal solution. However, the FP-based algorithm requires instantaneous global channel state information (CSI) for centralized processing, resulting in poor scalability and prohibitively high signalling overheads. Therefore, we further propose a distributed deep reinforcement learning (DRL)-based scheme, with which D2D pairs can autonomously optimize channel selection and transmit power by exploiting only local information and outdated nonlocal information. Compared with the FP-based algorithm, the DRL-based scheme achieves better scalability and significantly reduces signalling overheads. Simulation results demonstrate that even without instantaneous global CSI, the performance of the DRL-based scheme can closely approach that of the FP-based algorithm.
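The abstract formulates a WSR maximization over joint channel selection and transmit power, where pairs sharing a channel interfere with one another. A minimal sketch of that objective under a simple link-gain-matrix model (the function name, data layout, and noise default are illustrative assumptions, not the authors' code):

```python
import math

# Illustrative sketch, not the paper's implementation: weighted sum rate (WSR)
# of an overlay D2D network in which each pair selects one orthogonal channel
# and a transmit power. Only pairs on the same channel interfere.

def weighted_sum_rate(gains, channels, powers, weights, noise=1e-9):
    """gains[i][j]: link gain from the transmitter of pair j to the receiver
    of pair i; channels[i]: channel index chosen by pair i; powers[i]:
    transmit power of pair i; weights[i]: priority weight of pair i."""
    total = 0.0
    for i in range(len(powers)):
        signal = gains[i][i] * powers[i]
        # co-channel interference from every other pair on the same channel
        interference = sum(gains[i][j] * powers[j]
                           for j in range(len(powers))
                           if j != i and channels[j] == channels[i])
        total += weights[i] * math.log2(1.0 + signal / (interference + noise))
    return total
```

With two symmetric pairs, placing them on different channels removes the interference term and raises the WSR, which is exactly the coupling between channel selection and power control that makes the joint problem non-convex.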
Pages: 1363-1378
Page count: 16