Hierarchical Meta-Reinforcement Learning for Resource-Efficient Slicing in O-RAN

Cited by: 0
Authors
Chen, Xianfu [1 ]
Wu, Celimuge [2 ]
Zhao, Zhifeng [3 ]
Xiao, Yong [4 ]
Mao, Shiwen [5 ]
Ji, Yusheng [6 ]
Affiliations
[1] VTT Technical Research Centre of Finland Ltd, Oulu, Finland
[2] The University of Electro-Communications, Tokyo, Japan
[3] Zhejiang Lab, Hangzhou, China
[4] Huazhong University of Science and Technology, Wuhan, China
[5] Auburn University, Auburn, AL, USA
[6] National Institute of Informatics, Tokyo, Japan
Source
IEEE Global Communications Conference (GLOBECOM), 2023
Keywords
O-RAN; spectral efficiency; two-timescale optimization; hierarchical RL; meta-learning;
DOI
10.1109/GLOBECOM54140.2023.10437350
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline codes
0808; 0809
Abstract
Open radio access network (O-RAN) slicing allows flexible control of network components and resources to satisfy the ever-increasing demands of mobile applications. Managing the limited radio resources for optimized service provisioning is challenging, because it couples long-timescale resource orchestration among network slices with short-timescale slice configurations that must adapt to mobile user (MU) statistics. In this paper, we first propose a novel meta Markov decision process framework to mathematically formulate the problem of two-timescale radio resource management (RRM) in O-RAN slicing. The original RRM problem is then decoupled into a long-timescale master problem and a short-timescale subproblem, which are solved by a hierarchical reinforcement learning (RL) mechanism. The proposed mechanism combines a deep RL algorithm, which learns the optimal long-timescale RRM policy, with a linear-decomposition-based meta-RL algorithm, which learns the optimal short-timescale RRM policy. Numerical experiments verify the theoretical analysis and show that the proposed hierarchical RL mechanism outperforms the most representative state-of-the-art baselines.
Pages: 2729-2735
Page count: 7
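The abstract above describes a two-timescale hierarchical structure: a long-timescale policy orchestrates radio resources across slices, while a short-timescale policy configures each slice against the current MU statistics. The following Python sketch illustrates only that control-loop structure under assumed placeholder names (SLICES, long_policy, short_policy, env_step, T_LONG, T_SHORT, TOTAL_RBS); the policies here are random stand-ins, not the paper's deep RL or linear-decomposition-based meta-RL algorithms.

```python
# Illustrative sketch only (not the authors' implementation): a generic
# two-timescale hierarchical control loop for O-RAN slicing. All names and
# numbers below are hypothetical placeholders.
import random

SLICES = ["eMBB", "URLLC", "mMTC"]   # assumed slice set
T_LONG, T_SHORT = 5, 10              # assumed numbers of long/short-timescale steps
TOTAL_RBS = 100                      # assumed total number of radio resource blocks

def long_policy(slice_stats):
    """Long-timescale decision: orchestrate RBs across slices (random placeholder)."""
    weights = {s: random.random() for s in SLICES}
    total = sum(weights.values())
    return {s: int(TOTAL_RBS * w / total) for s, w in weights.items()}

def short_policy(rbs, mu_state):
    """Short-timescale decision: per-slice scheduling for the current MU state."""
    users = max(1, mu_state["active_users"])
    return {"rbs_per_user": rbs // users}   # even split as a trivial placeholder

def env_step(configs):
    """Hypothetical environment feedback, e.g., an achieved utility per slot."""
    return sum(c["rbs_per_user"] for c in configs.values()) * random.random()

for t_long in range(T_LONG):                            # long-timescale epochs
    stats = {s: {"active_users": random.randint(1, 8)} for s in SLICES}
    allocation = long_policy(stats)                     # inter-slice orchestration
    for t_short in range(T_SHORT):                      # short-timescale slots
        configs = {s: short_policy(allocation[s], stats[s]) for s in SLICES}
        utility = env_step(configs)                     # observed per-slot reward
    # In the paper, a deep RL agent would update the long-timescale policy here,
    # and a meta-RL agent would update the short-timescale policy from the
    # collected experience; both learners are omitted in this sketch.
```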