Dynamic Spectrum Access for D2D-Enabled Internet of Things: A Deep Reinforcement Learning Approach

Cited by: 14
Authors
Huang, Jingfei [1 ,2 ]
Yang, Yang [1 ,2 ]
Gao, Zhen [3 ,4 ]
He, Dazhong [1 ,2 ]
Ng, Derrick Wing Kwan [5 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing 100876, Peoples R China
[3] Southeast Univ, Natl Mobile Commun Res Lab, Nanjing 211189, Jiangsu, Peoples R China
[4] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
[5] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Device-to-device (D2D) communication; deep reinforcement learning (DRL); dynamic spectrum access; Internet of Things (IoT); resource allocation; communication; selection; networks;
DOI
10.1109/JIOT.2022.3160197
CLC number
TP [Automation and computer technology];
Discipline code
0812;
Abstract
Device-to-device (D2D) communication is regarded as a promising technology to support spectrally efficient Internet of Things (IoT) in beyond fifth-generation (5G) and sixth-generation (6G) networks. This article investigates the spectrum access problem in D2D-assisted cellular networks based on deep reinforcement learning (DRL); the proposed approach applies to both uplink and downlink scenarios. Specifically, we consider a time-slotted cellular network in which D2D nodes share the cellular spectrum resources with cellular users (CUEs) in a time-splitting manner. Moreover, D2D nodes can reuse time slots occupied by CUEs according to a location-based spectrum access (LSA) strategy, provided that the quality of cellular communication is preserved. The key challenge is that the D2D nodes have no information about the LSA strategy or the access pattern of the CUEs. Thus, we design a DRL-based spectrum access scheme that allows D2D nodes to autonomously learn an optimal access strategy, without any prior knowledge, to achieve a specific objective such as maximizing the normalized sum throughput. Furthermore, we adopt a generalized double deep Q-network (DDQN) algorithm and extend the objective function to explore resource allocation fairness among D2D nodes. The proposed scheme is evaluated under various conditions, and simulation results show that it achieves near-optimal throughput for different objectives relative to the benchmark, i.e., the theoretical throughput upper bound derived from a genie-aided scheme with complete system knowledge.
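As a rough illustration of the double deep Q-network update mentioned in the abstract, the PyTorch sketch below shows only the core double-DQN step that a spectrum-access agent of this kind could use: the online network selects the greedy next action and the target network evaluates it, which mitigates Q-value overestimation. The state layout, reward, number of shared slots, network sizes, and hyperparameters are illustrative assumptions and are not taken from the paper.

# Minimal double-DQN update sketch for a D2D spectrum-access agent.
# Environment details (state encoding, reward) are assumptions for illustration only.
import torch
import torch.nn as nn

N_CHANNELS = 4               # hypothetical number of shared time slots
N_ACTIONS = N_CHANNELS + 1   # access one slot or stay idle
STATE_DIM = 2 * N_CHANNELS   # e.g., last-slot occupancy + ACK feedback (assumed)
GAMMA = 0.95                 # discount factor (illustrative)

def make_q_net():
    # Small fully connected Q-network; layer sizes are illustrative.
    return nn.Sequential(
        nn.Linear(STATE_DIM, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, N_ACTIONS),
    )

online_net, target_net = make_q_net(), make_q_net()
target_net.load_state_dict(online_net.state_dict())
optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-3)

def ddqn_update(batch):
    """One double-DQN step: the online net selects the next action,
    the target net evaluates it (reduces Q-value overestimation)."""
    s, a, r, s_next, done = batch  # states, actions, rewards, next states, terminal flags
    q_sa = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        a_next = online_net(s_next).argmax(dim=1, keepdim=True)   # action selection
        q_next = target_net(s_next).gather(1, a_next).squeeze(1)  # action evaluation
        target = r + GAMMA * (1.0 - done) * q_next
    loss = nn.functional.smooth_l1_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random transitions (placeholder for real D2D interaction data).
batch = (
    torch.randn(32, STATE_DIM),
    torch.randint(0, N_ACTIONS, (32,)),
    torch.rand(32),                      # e.g., normalized throughput reward (assumed)
    torch.randn(32, STATE_DIM),
    torch.zeros(32),
)
print("DDQN loss:", ddqn_update(batch))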
Pages: 17793-17807
Page count: 15