Multi-Agent Deep Reinforcement Learning Based Spectrum Allocation for D2D Underlay Communications

Cited by: 124
Authors
Li, Zheng [1 ]
Guo, Caili [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing Key Lab Network Syst Architecture & Conve, Beijing 100876, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Device-to-device (D2D) communications; multi-agent deep reinforcement learning; spectrum allocation; RESOURCE-ALLOCATION; ALGORITHM; NETWORKS; SCHEME;
DOI
10.1109/TVT.2019.2961405
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
Device-to-device (D2D) communication underlaying cellular networks is a promising technique for improving spectrum efficiency. In this setting, D2D transmissions may cause severe interference to both cellular and other D2D links, which poses a great technical challenge for spectrum allocation. Existing centralized schemes require global information, which incurs a large signaling overhead, while existing distributed schemes require frequent information exchange among D2D users and cannot achieve global optimization. In this paper, a distributed spectrum allocation framework based on multi-agent deep reinforcement learning, named multi-agent actor critic (MAAC), is proposed. MAAC shares global historical states, actions, and policies during centralized training, requires no signaling interaction during execution, and exploits cooperation among users to further optimize system performance. Moreover, to reduce the computational complexity of training, we further propose the neighbor-agent actor critic (NAAC), which uses only neighboring users' historical information for centralized training. Simulation results show that the proposed MAAC and NAAC effectively reduce the outage probability of cellular links, greatly improve the sum rate of D2D links, and converge quickly.
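The core idea the abstract describes, centralized training with decentralized execution, can be illustrated with a toy sketch. This is not the paper's MAAC/NAAC implementation: the environment, reward, agent count, and tabular softmax actors below are all illustrative assumptions. During training, a shared baseline (a stand-in for a centralized critic) sees the joint actions and the global reward; at execution time, each agent selects a channel from its own local observation only, with no signaling between users.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3    # hypothetical number of D2D pairs
N_CHANNELS = 4  # hypothetical number of reusable cellular channels
N_OBS = 3       # hypothetical discrete local-interference levels

# Each agent's actor: softmax policy over channels, conditioned only on
# the agent's own local observation (decentralized execution).
theta = np.zeros((N_AGENTS, N_OBS, N_CHANNELS))

def policy(agent, obs):
    logits = theta[agent, obs]
    e = np.exp(logits - logits.max())
    return e / e.sum()

def act(agent, obs):
    # Execution uses local information only -- no inter-user signaling.
    return rng.choice(N_CHANNELS, p=policy(agent, obs))

def global_reward(actions):
    # Toy stand-in for "sum rate minus interference": an agent that
    # occupies a channel alone earns +1; colliding agents earn 0.
    return float(sum(1.0 for ch in range(N_CHANNELS)
                     if np.sum(actions == ch) == 1))

# Centralized training: the global reward and joint actions drive a
# policy-gradient update for every actor; `baseline` plays the role of
# a shared critic's value estimate.
baseline, lr = 0.0, 0.2
history = []
for episode in range(2000):
    obs = rng.integers(0, N_OBS, size=N_AGENTS)
    actions = np.array([act(i, obs[i]) for i in range(N_AGENTS)])
    r = global_reward(actions)
    history.append(r)
    advantage = r - baseline
    baseline += 0.05 * (r - baseline)
    for i in range(N_AGENTS):
        p = policy(i, obs[i])
        grad = -p
        grad[actions[i]] += 1.0  # grad of log pi(a | o) for a softmax actor
        theta[i, obs[i]] += lr * advantage * grad
```

With training, the agents learn to spread across channels and the average global reward rises toward the collision-free maximum, which mirrors how MAAC's centrally trained actors come to cooperate without runtime signaling.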
Pages: 1828-1840
Number of pages: 13
Cited References
42 in total
[31]   Artificial Intelligence-Based Techniques for Emerging Heterogeneous Network: State of the Arts, Opportunities, and Challenges [J].
Wang, Xiaofei ;
Li, Xiuhua ;
Leung, Victor C. M. .
IEEE ACCESS, 2015, 3 :1379-1391
[32]  
WATKINS CJCH, 1992, MACH LEARN, V8, P279, DOI 10.1007/BF00992698
[33]   Effectiveness of social marketing in improving knowledge, attitudes and practice of consumption of vitamin A-fortified oil in Tanzania [J].
Wu, Daphne Chen Nee ;
Corbett, Kitty ;
Horton, Susan ;
Saleh, Nadira ;
Mosha, Theobald C. E. .
PUBLIC HEALTH NUTRITION, 2019, 22 (03) :466-475
[34]   Intelligent Resource Scheduling for 5G Radio Access Network Slicing [J].
Yan, Mu ;
Feng, Gang ;
Zhou, Jianhong ;
Sun, Yao ;
Liang, Ying-Chang .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (08) :7691-7703
[35]   Intelligent Resource Management Based on Reinforcement Learning for Ultra-Reliable and Low-Latency IoV Communication Networks [J].
Yang, Helin ;
Xie, Xianzhong ;
Kadoch, Michel .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (05) :4157-4169
[36]  
Ye H, 2018, IEEE ICC
[37]   Deep Reinforcement Learning Based Resource Allocation for V2V Communications [J].
Ye, Hao ;
Li, Geoffrey Ye ;
Juang, Biing-Hwang Fred .
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (04) :3163-3173
[38]  
Ye H, 2018, INT WIREL COMMUN, P440, DOI 10.1109/IWCMC.2018.8450518
[39]  
Zaki FW, 2017, NAT RADIO SCI CO, P284, DOI 10.1109/NRSC.2017.7893487
[40]   A Deep-Learning-Based Radio Resource Assignment Technique for 5G Ultra Dense Networks [J].
Zhou, Yibo ;
Fadlullah, Zubair Md. ;
Mao, Bomin ;
Kato, Nei .
IEEE NETWORK, 2018, 32 (06) :28-34