GNN-Based Hierarchical Deep Reinforcement Learning for NFV-Oriented Online Resource Orchestration in Elastic Optical DCIs

Cited by: 45

Authors:

Li, Baojia [1]

Zhu, Zuqing [1]

Affiliations:

[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Anhui, Peoples R China

Source:

JOURNAL OF LIGHTWAVE TECHNOLOGY | 2022 / Vol. 40 / Issue 04

Funding:

National Key Research and Development Program of China;

Keywords:

Virtualization; Training; Optical fiber networks; Artificial neural networks; Adaptation models; Topology; Optical interconnections; Network function virtualization (NFV); service function chain; datacenter interconnection (DCI); elastic optical network (EON); graph neural network (GNN); deep reinforcement learning (DRL); network automation; SPECTRUM ASSIGNMENT; MODULATION; ALLOCATION; BACKUP;

DOI:

10.1109/JLT.2021.3125974

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Network function virtualization (NFV) in elastic optical datacenter interconnections (EO-DCIs) enables flexible and timely deployment of network services. However, as the service provisioning of virtual network function service chains (vNF-SCs) in an EO-DCI needs to orchestrate the allocations of IT resources in datacenters (DCs) and spectrum resources on fiber links dynamically, it is a complex and challenging problem. In this work, we model the problem as a Markov decision process (MDP), and propose a hierarchical deep reinforcement learning (DRL) model based on graph neural network (GNN), namely, HRLOrch, to tackle it. To ensure its universality and scalability, we design the policy neural network (NN) in HRLOrch based on a GNN. As the GNN-based policy NN can operate on the graph-structured network state of an EO-DCI directly, it can adapt to an arbitrary EO-DCI topology without any structural changes. Then, through analysis, we find that the EO-DCI is a sparse reward environment if we want to train a DRL model to minimize the blocking probability of vNF-SCs in it directly. To address this issue, we design a hierarchical DRL with lower-level and upper-level models to improve the convergence performance of training. Specifically, we make the lower-level DRL optimize the provisioning scheme of each vNF-SC to minimize its resource usage, while the upper-level one coordinates the provisioning of all the active vNF-SCs to minimize the overall blocking probability. Hence, the lower-level and upper-level DRL models operate cooperatively in the training to optimize the dynamic provisioning of vNF-SCs. Our simulations demonstrate the universality and scalability of HRLOrch, and confirm that it can outperform the existing algorithms for vNF-SC provisioning in an EO-DCI.
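The abstract's claim that a GNN-based policy "can adapt to an arbitrary EO-DCI topology without any structural changes" can be illustrated with a minimal sketch. This is not the HRLOrch architecture, which the abstract does not specify. It is a generic mean-aggregation message-passing layer, and every name in it (`make_weights`, `message_passing_layer`, `node_scores`), the two-element node features, and the toy topologies are assumptions made for illustration only.

```python
import math
import random

def make_weights(in_dim, out_dim, seed=0):
    """Hypothetical helper: a random in_dim x out_dim matrix shared by all nodes."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(out_dim)] for _ in range(in_dim)]

def matvec(vec, weights):
    """Multiply a feature vector (length in_dim) by an in_dim x out_dim matrix."""
    out_dim = len(weights[0])
    return [sum(vec[i] * weights[i][j] for i in range(len(vec))) for j in range(out_dim)]

def message_passing_layer(features, edges, w_self, w_neigh):
    """One round of mean-aggregation message passing.

    features: dict mapping node -> feature list
    edges: list of undirected (u, v) pairs, e.g. fiber links between DCs
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(next(iter(features.values())))
    updated = {}
    for node, feat in features.items():
        nbrs = neighbors[node]
        if nbrs:
            mean = [sum(features[n][k] for n in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim
        own = matvec(feat, w_self)
        agg = matvec(mean, w_neigh)
        updated[node] = [math.tanh(a + b) for a, b in zip(own, agg)]
    return updated

def node_scores(features, edges, w_self, w_neigh, w_out):
    """Score every node, then apply a softmax to get a distribution over nodes."""
    hidden = message_passing_layer(features, edges, w_self, w_neigh)
    logits = {n: matvec(h, w_out)[0] for n, h in hidden.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {n: math.exp(value - peak) for n, value in logits.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}

# The same three weight matrices are reused on two different topologies.
W_SELF = make_weights(2, 4, seed=1)
W_NEIGH = make_weights(2, 4, seed=2)
W_OUT = make_weights(4, 1, seed=3)

# Assumed toy node features: [free IT capacity, free spectrum fraction].
ring4 = ({0: [0.9, 0.5], 1: [0.2, 0.7], 2: [0.6, 0.1], 3: [0.4, 0.8]},
         [(0, 1), (1, 2), (2, 3), (3, 0)])
line6 = ({i: [0.1 * i, 1.0 - 0.1 * i] for i in range(6)},
         [(i, i + 1) for i in range(5)])

probs_ring = node_scores(*ring4, W_SELF, W_NEIGH, W_OUT)
probs_line = node_scores(*line6, W_SELF, W_NEIGH, W_OUT)
```

Because the weight shapes depend only on the per-node feature size and not on the number of nodes or links, the same parameters yield a valid distribution over 4 nodes on the ring and over 6 nodes on the line. That is the property the abstract refers to, shown here under the stated assumptions.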


Pages: 935-946

Page count: 12
