Hierarchical Meta-Reinforcement Learning for Resource-Efficient Slicing in O-RAN

Cited by: 0
Authors
Chen, Xianfu [1 ]
Wu, Celimuge [2 ]
Zhao, Zhifeng [3 ]
Xiao, Yong [4 ]
Mao, Shiwen [5 ]
Ji, Yusheng [6 ]
Affiliations
[1] VTT Technical Research Centre of Finland Ltd, Oulu, Finland
[2] The University of Electro-Communications, Tokyo, Japan
[3] Zhejiang Lab, Hangzhou, China
[4] Huazhong University of Science and Technology, Wuhan, China
[5] Auburn University, Auburn, AL, USA
[6] National Institute of Informatics, Tokyo, Japan
Source
IEEE Global Communications Conference (GLOBECOM), 2023
Keywords
O-RAN; spectral efficiency; two-timescale optimization; hierarchical RL; meta-learning;
DOI
10.1109/GLOBECOM54140.2023.10437350
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline codes
0808; 0809
Abstract
Open radio access network (O-RAN) slicing allows flexible control of network components and resources to satisfy the ever-increasing demands of mobile applications. Managing the limited radio resources for optimized service provisioning is challenging, because it couples long-timescale resource orchestration among network slices with short-timescale slice configurations that must adapt to mobile user (MU) statistics. In this paper, we first propose a novel meta Markov decision process framework to mathematically formulate the problem of two-timescale radio resource management (RRM) in O-RAN slicing. The original RRM problem is then decoupled into a long-timescale master problem and a short-timescale subproblem, which are solved by a hierarchical reinforcement learning (RL) mechanism. The proposed mechanism combines a deep RL algorithm, which learns the optimal long-timescale RRM policy, with a linear-decomposition-based meta-RL algorithm, which learns the optimal short-timescale RRM policy. Numerical experiments verify the theoretical analysis and show that the proposed hierarchical RL mechanism outperforms the most representative state-of-the-art baselines.
Pages: 2729-2735
Page count: 7
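The abstract above describes a two-timescale hierarchical structure: a long-timescale policy orchestrates radio resources across slices, while a short-timescale policy configures each slice against the current MU statistics. The following Python sketch illustrates only that control-loop structure under assumed placeholder names (SLICES, long_policy, short_policy, env_step, T_LONG, T_SHORT, TOTAL_RBS); the policies here are random stand-ins, not the paper's deep RL or linear-decomposition-based meta-RL algorithms.

```python
# Illustrative sketch only (not the authors' implementation): a generic
# two-timescale hierarchical control loop for O-RAN slicing. All names and
# numbers below are hypothetical placeholders.
import random

SLICES = ["eMBB", "URLLC", "mMTC"]   # assumed slice set
T_LONG, T_SHORT = 5, 10              # assumed numbers of long/short-timescale steps
TOTAL_RBS = 100                      # assumed total number of radio resource blocks

def long_policy(slice_stats):
    """Long-timescale decision: orchestrate RBs across slices (random placeholder)."""
    weights = {s: random.random() for s in SLICES}
    total = sum(weights.values())
    return {s: int(TOTAL_RBS * w / total) for s, w in weights.items()}

def short_policy(rbs, mu_state):
    """Short-timescale decision: per-slice scheduling for the current MU state."""
    users = max(1, mu_state["active_users"])
    return {"rbs_per_user": rbs // users}   # even split as a trivial placeholder

def env_step(configs):
    """Hypothetical environment feedback, e.g., an achieved utility per slot."""
    return sum(c["rbs_per_user"] for c in configs.values()) * random.random()

for t_long in range(T_LONG):                            # long-timescale epochs
    stats = {s: {"active_users": random.randint(1, 8)} for s in SLICES}
    allocation = long_policy(stats)                     # inter-slice orchestration
    for t_short in range(T_SHORT):                      # short-timescale slots
        configs = {s: short_policy(allocation[s], stats[s]) for s in SLICES}
        utility = env_step(configs)                     # observed per-slot reward
    # In the paper, a deep RL agent would update the long-timescale policy here,
    # and a meta-RL agent would update the short-timescale policy from the
    # collected experience; both learners are omitted in this sketch.
```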