Hierarchical Reinforcement Learning for Relay Selection and Power Optimization in Two-Hop Cooperative Relay Network

Cited by: 25
Authors
Geng, Yuanzhe [1]
Liu, Erwu [1]
Wang, Rui [1, 2]
Liu, Yiming [1]
Affiliations
[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Relays; Resource management; Probability; Power system reliability; Optimization; Signal to noise ratio; Relay networks (telecommunication); Cooperative communication; outage probability; relay selection; power allocation; hierarchical reinforcement learning; RESOURCE-ALLOCATION; DIVERSITY;
DOI
10.1109/TCOMM.2021.3119689
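Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract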
In this paper, we study the outage probability minimization problem in a two-hop cooperative relay network. To reduce outage probability, existing studies propose many schemes for relay selection and power allocation, which are usually based on the assumption of exact channel state information (CSI). However, it is difficult to obtain perfect instantaneous CSI in practical situations where channel states change rapidly, and thus traditional methods would not perform well. Considering these factors, we turn to the emerging reinforcement learning (RL) methods for solutions. RL methods do not need any prior knowledge of CSI, but use neural networks for approximation and decision-making after interacting with the communication environment. Nevertheless, conventional RL methods, including most deep reinforcement learning (DRL) methods, cannot perform well when the search space is too large. In addition, non-stationarity is a common problem when using hierarchical reinforcement learning (HRL), which is caused by the changing behavior in different hierarchies. Therefore, we first propose a DRL framework with an outage-based reward function, which is then used as a baseline. Then, we further design an HRL framework and training algorithm. By decomposing relay selection and power allocation into two hierarchical optimization objectives, and combining on-policy and off-policy methods in the HRL framework, our method successfully addresses the sparse reward and non-stationarity problems. Simulation results reveal that, compared with the traditional DRL method, the proposed HRL training algorithm can converge faster and reduce the outage probability by 8% in a two-hop relay network with the same outage threshold.
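As a rough illustration of the quantity the abstract optimizes, the sketch below estimates the outage probability of one relay choice and one power split by Monte Carlo simulation. It is not the authors' method or simulation setup. Everything here is an assumption made for illustration: Rayleigh fading (exponentially distributed channel power gains), decode-and-forward relaying (the weaker hop limits the link), a total power budget split by a fraction `alpha`, unit noise power, a 0/1 reward, and all function and parameter names.

```python
import random


def end_to_end_snr(gain_sr, gain_rd, total_power, alpha, noise_power=1.0):
    """End-to-end SNR under an assumed decode-and-forward model.

    alpha is the fraction of total_power given to the source; the relay
    gets the remainder. The weaker of the two hops limits the link.
    """
    snr_sr = alpha * total_power * gain_sr / noise_power
    snr_rd = (1.0 - alpha) * total_power * gain_rd / noise_power
    return min(snr_sr, snr_rd)


def outage_reward(snr, snr_threshold):
    """Hypothetical outage-based reward: 1.0 on success, 0.0 on outage."""
    return 1.0 if snr >= snr_threshold else 0.0


def estimate_outage(mean_gains_sr, mean_gains_rd, relay, alpha,
                    total_power, snr_threshold, trials=20000, seed=0):
    """Monte Carlo outage estimate for one (relay, alpha) decision pair."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        # Assumed Rayleigh fading: power gains are exponential with the given mean.
        g_sr = rng.expovariate(1.0 / mean_gains_sr[relay])
        g_rd = rng.expovariate(1.0 / mean_gains_rd[relay])
        snr = end_to_end_snr(g_sr, g_rd, total_power, alpha)
        if outage_reward(snr, snr_threshold) == 0.0:
            outages += 1
    return outages / trials


if __name__ == "__main__":
    # Two hypothetical relays with different average channel gains.
    mean_sr = [1.0, 0.5]
    mean_rd = [1.0, 2.0]
    for relay in range(2):
        p_out = estimate_outage(mean_sr, mean_rd, relay, alpha=0.5,
                                total_power=10.0, snr_threshold=1.0)
        print(f"relay {relay}: estimated outage probability {p_out:.3f}")
```

In this toy model, the relay index plays the role of the upper-level decision and `alpha` the lower-level one; a learning agent would observe only the reward, not the channel gains drawn inside the loop.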
Pages: 171-184
Number of pages: 14
Related papers
43 in total
  • [1] On the Performance Analysis of Multirelay Cooperative Diversity Systems With Channel Estimation Errors
    Amin, Osama
    Ikki, Salama Said
    Uysal, Murat
    [J]. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2011, 60 (05) : 2050 - 2059
  • [2] Statistical channel knowledge-based optimum power allocation for relaying protocols in the high SNR regime
    Annavajjala, Ramesh
    Cosman, Pamela C.
    Milstein, Laurence B.
    [J]. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2007, 25 (02) : 292 - 305
  • [3] Joint Power and Time Allocation for Two-Way Cooperative NOMA
    Bae, Jimin
    Han, Youngnam
    [J]. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2019, 68 (12) : 12443 - 12447
  • [4] A simple cooperative diversity method based on network path selection
    Bletsas, A
    Khisti, A
    Reed, DP
    Lippman, A
    [J]. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2006, 24 (03) : 659 - 672
  • [5] Multihop diversity in wireless relaying channels
    Boyer, J
    Falconer, DD
    Yanikomeroglu, H
    [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 2004, 52 (10) : 1820 - 1830
  • [6] Joint Noisy Network Coding and Decode-Forward Relaying for Non-Orthogonal Multiple Access
    Chattha, Jawwad Nasar
    Uppal, Momin
    [J]. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2019, 18 (01) : 296 - 309
  • [7] Chen Z., 2018, DECENTRALIZED COMPUT
  • [9] Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning
    Dilokthanakul, Nat
    Kaplanis, Christos
    Pawlowski, Nick
    Shanahan, Murray
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (11) : 3409 - 3418
  • [10] Buffer-Aided Max-Link Relay Selection for Multi-Way Cooperative Multi-Antenna Systems
    Duarte, F. L.
    de Lamare, R. C.
    [J]. IEEE COMMUNICATIONS LETTERS, 2019, 23 (08) : 1423 - 1426