Deep Reinforcement Learning for Distributed Dynamic MISO Downlink-Beamforming Coordination

Cited by: 60
Authors
Ge, Jungang [1]
Liang, Ying-Chang [1]
Joung, Jingon [2]
Sun, Sumei [3]
Affiliations
[1] Univ Elect Sci & Technol China UESTC, Ctr Intelligent Networking & Commun CINC, Chengdu 611731, Peoples R China
[2] Chung Ang Univ, Sch Elect & Elect Engn, Seoul 06974, South Korea
[3] Inst Infocomm Res, Singapore 138632, Singapore
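Funding
National Research Foundation, Singapore; National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords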
Downlink; Intercell interference; Wireless communication; MISO communication; Data communication; Reinforcement learning; Downlink-beamforming coordination; multi-input single-output (MISO) interference channel; deep reinforcement learning; interference mitigation; SUM-RATE MAXIMIZATION; POWER-CONTROL; NETWORKS; SYSTEMS;
DOI
10.1109/TCOMM.2020.3004524
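Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract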
We consider a homogeneous cellular network where a multi-antenna base station (BS) in each cell transmits messages to its intended user over a common frequency band. To improve the system capacity of this multi-cell multi-input single-output (MISO) interference channel, one of the state-of-the-art algorithms, namely, downlink-beamforming coordination, allows all BSs to cooperate with one another to mitigate the effect of inter-cell interference. However, most existing algorithms are suboptimal and impractical in a dynamic wireless environment, due to the high computational complexity and the overhead involved in collecting global channel state information (CSI). In this study, we exploit deep reinforcement learning (DRL) and propose a distributed dynamic downlink-beamforming coordination (DDBC) method with partial observability of the CSI. Each BS trains its own deep Q-network and employs an appropriate beamformer depending on its environment, which is observed through a designed limited-information exchange protocol. The simulation results show that the proposed DRL-based DDBC method, with a considerably lower system overhead, achieves a system capacity that is very close to that of the fractional programming algorithm with global and instantaneous CSI measurements. In addition, this work demonstrates the potential of utilizing DRL to solve DDBC problems in a more practical manner.
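The system capacity mentioned in the abstract is commonly measured as the sum rate of the MISO interference channel. The following is a minimal sketch of that textbook quantity, not the authors' implementation: the function name `sum_rate`, the array layout of `H` and `W`, and the unit noise power are all assumptions made for illustration.

```python
import numpy as np

def sum_rate(H, W, noise_power=1.0):
    """Sum rate (bits/s/Hz) of a K-cell MISO interference channel.

    Hypothetical layout, assumed for this sketch:
      H[j, k] is the length-N channel vector from BS j to user k,
      W[j]    is the length-N beamformer used by BS j for its own user j.
    """
    # G[j, k] = |h_{j,k}^H w_j|^2: power received at user k from BS j.
    G = np.abs(np.einsum("jkn,jn->jk", H.conj(), W)) ** 2
    signal = np.diag(G)                      # desired power at each user
    interference = G.sum(axis=0) - signal    # inter-cell interference at each user
    sinr = signal / (interference + noise_power)
    return float(np.log2(1.0 + sinr).sum())

# Two cells, two antennas per BS; direct links are orthogonal to cross links.
H = np.zeros((2, 2, 2), dtype=complex)
H[0, 0] = [1, 0]
H[0, 1] = [0, 1]
H[1, 0] = [0, 1]
H[1, 1] = [1, 0]
W_good = np.array([[1, 0], [1, 0]], dtype=complex)  # each BS steers at its own user
W_bad = np.array([[1, 0], [0, 1]], dtype=complex)   # BS 1 steers at user 0 instead
print(sum_rate(H, W_good), sum_rate(H, W_bad))
```

In this toy setup, a beamformer pointed at the wrong user both removes the desired signal and adds interference. That coupling between cells is what coordination schemes try to manage.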
Pages: 6070-6085
Page count: 16