Knowledge-Collaboration-Based Resource Allocation in 6G IoT: A Graph Attention RL Approach

Cited by: 4

Authors:

Huang, Zhongwei [1,2]

Yu, Fei Richard [3]

Cai, Jun [2]

Affiliations:

[1] Macau University of Science and Technology, School of Computer Science and Engineering, Macau, People's Republic of China

[2] Guangdong Polytechnic Normal University, School of Cyber Security, Guangzhou 510665, People's Republic of China

[3] Carleton University, School of Information Technology, Ottawa, ON K1S 5B6, Canada

Source:

IEEE Internet of Things Journal | 2024 / Vol. 11 / No. 22

Funding:

National Natural Science Foundation of China

Keywords:

Training; Collaboration; 6G mobile communication; Resource management; Heuristic algorithms; Servers; Computational modeling; Collective reinforcement learning (CRL); graph attention (GAT); knowledge collaboration; resource allocation; MECHANISM; INTERNET; 5G

DOI:

10.1109/JIOT.2024.3416054

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In future 6G-enabled Internet of Things (IoT), users and devices will be divided into numerous distributed domains with smaller base station coverage due to the use of terahertz high-frequency band communication. Deep reinforcement learning (DRL) agents will be increasingly deployed in these domains to achieve intelligent service provisioning and resource allocation. However, existing DRL-based methods suffer from repeated model training and poor generalization when service demand fluctuates and the environment changes. In addition, the limited training samples available in each domain lead to insufficient model training. Inspired by the collaborative learning of human knowledge, we propose a knowledge collaboration-based resource allocation mechanism for future 6G-enabled IoT and address two basic issues: 1) which agents to collaborate with and 2) how to collaborate. Specifically, we first model the distributed network as a graph and use graph attention (GAT) to capture the fluctuating service demands and time-varying resource capacities in the temporal and spatial domains, and then calculate the similarity between the agents. We further propose a collective reinforcement learning (CRL) algorithm that facilitates knowledge collaboration between the agents through the policy distribution. Simulation results verify that the proposed GAT-CRL converges as fast as deep deterministic policy gradient (DDPG), within 4K steps, computes the similarity score more accurately as the number of attention heads increases, and achieves higher successful flow than soft actor-critic (by about 3.6%-5.4%) and DDPG (by about 14.6%-21%) when adapting to unseen traffic patterns/loads and increasing topology scales.
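
The abstract's first step (graph attention over the network, then a similarity score between agents) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer is the generic single-head graph attention form, and the use of cosine similarity, the toy line topology, the feature sizes, and every name below (`gat_layer`, `cosine_similarity`, `partners`) are assumptions made purely for illustration.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def gat_layer(features, adjacency, weight, attn_vec):
    """Generic single-head graph attention layer (illustrative only).

    features:  (N, F) per-agent observations, e.g. demand and capacity
    adjacency: (N, N) nonzero where agents are neighbors, self-loops included
    weight:    (F, H) shared linear projection
    attn_vec:  (2H,)  attention parameters
    """
    h = features @ weight                                # (N, H)
    hidden = h.shape[1]
    src = h @ attn_vec[:hidden]                          # contribution of node i
    dst = h @ attn_vec[hidden:]                          # contribution of node j
    scores = leaky_relu(src[:, None] + dst[None, :])     # raw score e_ij
    scores = np.where(adjacency > 0, scores, -np.inf)    # mask non-neighbors
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)     # row-wise softmax
    return alpha @ h, alpha


def cosine_similarity(embeddings, eps=1e-12):
    """Pairwise cosine similarity (an assumed choice of similarity score)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, eps)
    return unit @ unit.T


# Hypothetical toy setup: 4 agents on a line topology with self-loops.
rng = np.random.default_rng(0)
n_agents, n_feat, n_hidden = 4, 3, 5
features = rng.normal(size=(n_agents, n_feat))
adjacency = np.array([[1, 1, 0, 0],
                      [1, 1, 1, 0],
                      [0, 1, 1, 1],
                      [0, 0, 1, 1]])
weight = rng.normal(size=(n_feat, n_hidden))
attn_vec = rng.normal(size=2 * n_hidden)

embeddings, alpha = gat_layer(features, adjacency, weight, attn_vec)
similarity = cosine_similarity(embeddings)

# Pick, for each agent, the most similar other agent as a candidate partner.
masked = similarity.copy()
np.fill_diagonal(masked, -np.inf)
partners = masked.argmax(axis=1)
```

In this sketch each row of `alpha` is a probability distribution over an agent's neighbors, and `partners[i]` is one simple answer to the question of which agent to collaborate with. How the paper combines multiple attention heads or shares knowledge through the policy distribution is not specified in the abstract and is not modeled here.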

Pages: 36581-36595

Page count: 15
