Reliable adaptive edge-cloud collaborative DNN inference acceleration scheme combining computing and communication resources in optical networks

Cited by: 1

Authors:

Yin, Shan [1]

Jiao, Yurong [1]

You, Chenyu [1]

Cai, Mengru [1]

Jin, Tianyu [1]

Huang, Shanguo [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing 100876, Peoples R China

Source:

JOURNAL OF OPTICAL COMMUNICATIONS AND NETWORKING | 2023 / Vol. 15 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Task analysis; Servers; Collaboration; Optical fiber networks; Computational modeling; Reliability; Cloud computing; ALLOCATION; MULTIUSER; SPECTRUM; INTERNET; THINGS;

DOI:

10.1364/JOCN.495765

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

With the continuous development of the Artificial Intelligence of Things, deep neural network (DNN) models require a larger amount of computing capacity. The emerging edge-cloud collaboration architecture in optical networks is proposed as an effective solution, which combines edge computing with cloud computing to provide faster response and reduce the cloud load for compute-intensive tasks. The multi-layered DNN model can be divided into subtasks that are offloaded to edge and cloud servers for computation in this architecture. In addition, as bearer networks for computing capacity, once a server or link in optical networks fails, a large amount of data can be lost, so the robust reliability of the edge-cloud collaborative optical networks is very important. To solve the above problems, we design a reliable adaptive edge-cloud collaborative DNN inference acceleration scheme (RACIA) combining computing and communication resources. We formulate the RACIA into a mixed integer linear programming model and develop a multi-agent deep reinforcement learning algorithm (MADRL-RACIA) to jointly optimize DNN task partitioning, offloading, and protection. The simulation results show that compared with the benchmark schemes, the proposed MADRL-RACIA can provide a guarantee of reliability for more tasks under latency constraints and reduce the blocking probability.

Pages: 750-764

Number of pages: 15