Collaborative Deep Reinforcement Learning for Resource Optimization in Non-Terrestrial Networks

Cited by: 1
Authors
Cao, Yang [1, 2]
Lien, Shao-Yu [3 ]
Liang, Ying-Chang [1, 2]
Niyato, Dusit [4 ]
Shen, Xuemin [5 ]
Affiliations
[1] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou, Peoples R China
[2] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[3] Natl Yang Ming Chiao Tung Univ, Tainan, Taiwan
[4] Nanyang Technol Univ, Singapore, Singapore
[5] Univ Waterloo, Waterloo, ON, Canada
Source
2023 IEEE 34TH ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS, PIMRC | 2023
Funding
National Key Research and Development Program of China; National Research Foundation, Singapore
Keywords
Non-terrestrial networks (NTNs); earth-fixed cell; resource allocation; deep reinforcement learning (DRL); multi-time-scale Markov decision processes (MMDPs)
DOI
10.1109/PIMRC56721.2023.10294047
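Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract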
Non-terrestrial networks (NTNs) with low-earth-orbit (LEO) satellites are regarded as a promising means of supporting globally ubiquitous wireless services. Because LEO satellites move rapidly, inter-beam and inter-satellite handovers occur frequently for a given user equipment (UE). To tackle this issue, earth-fixed cell scenarios have been studied, in which the LEO satellite steers its beam towards a fixed area throughout its dwell duration to maintain stable transmission performance for the UE. This requires the LEO satellite to perform real-time resource allocation, which is unaffordable given its limited computing capability. To address this issue, we propose a two-time-scale collaborative deep reinforcement learning (DRL) scheme for beam management and resource allocation in NTNs, in which the LEO satellite and the UE, operating with different control cycles, update their decision-making policies sequentially. Specifically, the UE updates its policy subject to improving the value functions of both agents. Furthermore, the LEO satellite makes decisions only through finite-step rollouts, using a reference decision trajectory received from the UE. Simulation results show that the proposed scheme effectively balances throughput performance and computational complexity compared with traditional greedy-search schemes.
Pages: 7
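
The abstract mentions that one agent decides through finite-step rollouts guided by a reference decision trajectory supplied by the other agent. The record does not give the paper's system model, so the following is only a minimal sketch of that general idea, not the authors' algorithm: the function names, the integer backlog state, the `toy_step` dynamics, and the reward are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical one-step model: (state, action) -> (next_state, reward).
StepFn = Callable[[int, int], Tuple[int, float]]


def rollout_action(
    state: int,
    candidate_actions: Sequence[int],
    reference_trajectory: Sequence[int],
    step: StepFn,
    horizon: int,
) -> Tuple[int, float]:
    """Score each candidate first action by simulating `horizon` steps.

    The first simulated step applies the candidate; the remaining
    steps replay the reference trajectory (the base decisions).
    Returns the best candidate and its simulated return.
    """
    best_action, best_return = candidate_actions[0], float("-inf")
    for action in candidate_actions:
        plan = [action] + list(reference_trajectory[: horizon - 1])
        s, total = state, 0.0
        for a in plan:
            s, r = step(s, a)
            total += r
        if total > best_return:
            best_action, best_return = action, total
    return best_action, best_return


def toy_step(state: int, action: int) -> Tuple[int, float]:
    """Assumed toy dynamics: `state` is a traffic backlog and `action`
    is the number of resource units allocated in this step."""
    served = min(state, action)
    next_state = state - served + 1      # one new unit arrives per step
    reward = served - 0.5 * action       # throughput minus resource cost
    return next_state, reward


if __name__ == "__main__":
    # Backlog of 3, reference trajectory "allocate 1 unit" twice, 3-step lookahead.
    print(rollout_action(3, [0, 1, 2, 3, 4], [1, 1], toy_step, horizon=3))  # (3, 2.5)
```

The cost of this lookahead grows with the number of candidates times the horizon, which is the kind of bounded computation the abstract contrasts with exhaustive greedy search.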