Multi-Task Deep Reinforcement Learning for Terahertz NOMA Resource Allocation With Hybrid Discrete and Continuous Actions

Cited by: 2
Authors
Hu, Zhifeng [1]
Han, Chong [1, 2, 3]
Deng, Yansha [4]
Wang, Xudong [5]
Affiliations
[1] Shanghai Jiao Tong Univ, Terahertz Wireless Commun TWC Lab, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Engn, Shanghai 200240, Peoples R China
[3] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr CMIC, Shanghai 200240, Peoples R China
[4] Kings Coll London, Dept Engn, London WC2R 2LS, England
[5] Shanghai Jiao Tong Univ, Univ Michigan Shanghai Jiao Tong Univ UM SJTU Join, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Resource management; NOMA; Throughput; Terahertz communications; Wireless communication; Multitasking; Hybrid power systems; Deep reinforcement learning (DRL); non-orthogonal multiple access (NOMA); Terahertz (THz) networks; NONORTHOGONAL MULTIPLE-ACCESS; POWER ALLOCATION; JOINT POWER; MIMO-NOMA; SYSTEMS; NETWORKS; COMMUNICATION; INTERFERENCE; CHALLENGES; CAPACITY;
DOI
10.1109/TVT.2024.3381238
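Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract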
Terahertz (THz) non-orthogonal multiple access (NOMA) networks have great potential for next-generation wireless communications because they promise ultra-high data rates and user fairness. In THz-NOMA networks, efficient and effective long-term beamforming-bandwidth-power (BBP) allocation remains an open problem because it is non-deterministic polynomial-time hard (NP-hard). In this article, the continuous nature of power and sub-array ratio assignment and the discrete nature of sub-band allocation are treated explicitly. Based on these properties, an offline hybrid discrete and continuous actions (DISCO) multi-task deep reinforcement learning (DRL) algorithm is proposed to maximize the long-term throughput. Specifically, multi-task learning enables the actor of DISCO to integrate two state-of-the-art DRL algorithms: actor-critic (AC), which only selects discrete actions, and deep deterministic policy gradient (DDPG), which only generates continuous actions. Rigorous theoretical derivations of the neural network design and the backpropagation process are provided to tailor DISCO to the BBP problem. Compared with the benchmark no-learning and conventional DRL algorithms, DISCO improves the network throughput while achieving good fairness among users. Furthermore, the computation time of DISCO is on the order of hundreds of milliseconds, which indicates its practicality.
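The hybrid action structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' DISCO network: the class name `HybridActor`, the layer sizes, the tanh, softmax and sigmoid choices, and the random untrained weights are all assumptions made for illustration. The sketch only shows how one shared layer can feed a discrete head (a sub-band choice) and a continuous head (power and sub-array ratios in the unit interval); the critic and the backpropagation steps are omitted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HybridActor:
    """Hypothetical actor with a shared layer and two task heads (assumption)."""

    def __init__(self, state_dim, hidden_dim, num_subbands, num_continuous, seed=0):
        self.rng = random.Random(seed)

        def init(rows, cols):
            return [[self.rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_shared, self.b_shared = init(hidden_dim, state_dim), [0.0] * hidden_dim
        self.w_disc, self.b_disc = init(num_subbands, hidden_dim), [0.0] * num_subbands
        self.w_cont, self.b_cont = init(num_continuous, hidden_dim), [0.0] * num_continuous

    def act(self, state):
        # Shared representation used by both heads.
        hidden = [math.tanh(h) for h in linear(self.w_shared, self.b_shared, state)]
        # Discrete head: sample a sub-band index from a softmax distribution.
        probs = softmax(linear(self.w_disc, self.b_disc, hidden))
        subband = self.rng.choices(range(len(probs)), weights=probs, k=1)[0]
        # Continuous head: deterministic ratios squashed into (0, 1).
        ratios = [sigmoid(z) for z in linear(self.w_cont, self.b_cont, hidden)]
        return subband, probs, ratios


# Usage with made-up dimensions: 4 state features, 3 sub-bands, 2 ratios.
actor = HybridActor(state_dim=4, hidden_dim=8, num_subbands=3, num_continuous=2)
subband, probs, ratios = actor.act([0.2, -0.1, 0.5, 0.0])
print(subband, probs, ratios)
```

Sampling the discrete head and keeping the continuous head deterministic mirrors the AC and DDPG roles named in the abstract; how the paper actually couples the two heads during training is not specified here.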
Pages: 11647-11663
Number of pages: 17