Machine Learning for Robust Network Design: A New Perspective

Cited by: 2
Authors
Liu, Chenyi [1]
Aggarwal, Vaneet [2]
Lan, Tian [3]
Geng, Nan [4]
Yang, Yuan [4]
Xu, Mingwei [4]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[2] Purdue Univ, W Lafayette, IN 47907 USA
[3] George Washington Univ, Washington, DC 20052 USA
[4] Tsinghua Univ, Beijing, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Network topology; Topology; Routing; Planning; Optimization; Prediction algorithms; Machine learning;
DOI
10.1109/MCOM.002.2200670
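Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code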
0808 ; 0809 ;
Abstract
With the rapid growth of backbone networks and data center networks, ensuring network robustness under various failure scenarios has become a key challenge in network design. The combinatorial nature of failure scenarios in the data plane, control plane, and management plane seriously challenges existing practice in robust network design, which often requires verifying the designed network's performance by enumerating all possible failure combinations. Meanwhile, machine learning (ML) has been applied to many networking problems with tremendous success. In this article, we show a general approach to leveraging machine learning to support robust network design. First, we give a selective overview of current work on robust network design and show that failure evaluation provides a common kernel for improving the tractability and scalability of existing solutions. Then we propose a function approximation of the common kernel, based on a graph attention network (GAT), to efficiently evaluate the impact of various potential failure scenarios and identify critical failures that may have significant consequences. The function approximation allows us to obtain new models of three important robust network design problems and to solve them efficiently by evaluating the solutions against a pruned set of critical failures. We evaluate our approach in the three use cases and demonstrate a significant reduction in time-to-solution with a minimal performance gap. Finally, we discuss how the proposed framework can be applied to many other robust network design problems.
Pages: 86-92
Number of pages: 7