Multi-Task Deep Reinforcement Learning for Terahertz NOMA Resource Allocation With Hybrid Discrete and Continuous Actions

Cited by: 2

Authors:

Hu, Zhifeng [1]

Han, Chong [1,2,3]

Deng, Yansha [4]

Wang, Xudong [5]

Affiliations:

[1] Shanghai Jiao Tong Univ, Terahertz Wireless Commun TWC Lab, Shanghai 200240, Peoples R China

[2] Shanghai Jiao Tong Univ, Dept Engn, Shanghai 200240, Peoples R China

[3] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr CMIC, Shanghai 200240, Peoples R China

[4] Kings Coll London, Dept Engn, London WC2R 2LS, England

[5] Shanghai Jiao Tong Univ, Univ Michigan Shanghai Jiao Tong Univ UM SJTU Join, Shanghai 200240, Peoples R China

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2024 / Vol. 73 / Issue 08

Funding:

National Natural Science Foundation of China;

Keywords:

Resource management; NOMA; Throughput; Terahertz communications; Wireless communication; Multitasking; Hybrid power systems; Deep reinforcement learning (DRL); non-orthogonal multiple access (NOMA); Terahertz (THz) networks; NONORTHOGONAL MULTIPLE-ACCESS; POWER ALLOCATION; JOINT POWER; MIMO-NOMA; SYSTEMS; NETWORKS; COMMUNICATION; INTERFERENCE; CHALLENGES; CAPACITY;

DOI:

10.1109/TVT.2024.3381238

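Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: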

0808 ; 0809 ;

Abstract:

Terahertz (THz) non-orthogonal multiple access (NOMA) networks have great potential for next-generation wireless communications, promising ultra-high data rates and user fairness. In THz-NOMA networks, efficient and effective long-term beamforming-bandwidth-power (BBP) allocation is still an open problem due to its non-deterministic polynomial-time hard (NP-hard) nature. In this article, the continuous property of power and sub-array ratio assignment and the discrete property of sub-band allocation are carefully treated. In light of these attributes, an offline hybrid discrete and continuous actions (DISCO) multi-task deep reinforcement learning (DRL) algorithm is proposed to maximize the long-term throughput. Specifically, the deployment of multi-task learning enables the actor of DISCO to integrate two state-of-the-art DRL algorithms, namely actor-critic (AC), which only selects discrete actions, and deep deterministic policy gradient (DDPG), which only generates continuous actions. Rigorous theoretical derivations for the neural network design and backpropagation process are provided to tailor the proposed DISCO to the BBP problem. Compared to the benchmark no-learning and conventional DRL algorithms, DISCO enhances the network throughput while achieving good fairness among users. Furthermore, DISCO requires computation time on the order of hundreds of milliseconds, demonstrating its practicality.

Pages: 11647-11663

Number of pages: 17
