Fast and scalable all-optical network architecture for distributed deep learning

Authors
Li, Wenzhe [1]
Yuan, Guojun [1]
Wang, Zhan [1]
Tan, Guangming [1]
Zhang, Peiheng [1,2]
Rouskas, George N. [3]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, 6 Kexueyuan South Rd Zhongguancun, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Intelligent Comp Technol, 88 Jinji Lake Ave,Ind Pk, Suzhou, Peoples R China
[3] North Carolina State Univ, Dept Comp Sci, 890 Oval Dr, Raleigh, NC 27695 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
PERFORMANCE; OPERATIONS;
DOI
10.1364/JOCN.511696
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the ever-increasing size of training models and datasets, network communication has emerged as a major bottleneck in distributed deep learning training. To address this challenge, we propose an optical distributed deep learning (ODDL) architecture. ODDL utilizes a fast yet scalable all-optical network architecture to accelerate distributed training. One of the key features of the architecture is its flow-based transmit scheduling with fast reconfiguration. This allows ODDL to allocate dedicated optical paths for each traffic stream dynamically, resulting in low network latency and high network utilization. Additionally, ODDL provides physically isolated and tailored network resources for training tasks by reconfiguring the optical switch using LCoS-WSS technology. The ODDL topology also uses tunable transceivers to adapt to time-varying traffic patterns. To achieve accurate and fine-grained scheduling of optical circuits, we propose an efficient distributed control scheme that incurs minimal delay overhead. Our evaluation on real-world traces demonstrates ODDL's performance: when implemented with 1024 nodes and 100 Gbps bandwidth, ODDL accelerates VGG19 training by 1.6x and 1.7x compared to conventional fat-tree electrical networks and photonic SiP-Ring architectures, respectively. We further build a four-node testbed, and our experiments show that ODDL achieves a training time comparable to that of an ideal electrical switching network. (c) 2024 Optica Publishing Group
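The idea of giving each traffic stream its own dedicated optical path can be illustrated with a toy model. The sketch below is not ODDL's scheduler; it is a generic greedy first-fit wavelength assignment under simplifying assumptions that are not taken from the abstract: each flow is a (source, destination) pair, and a wavelength may be used at most once per transmitting node and once per receiving node. The function name `assign_circuits` and its parameters are hypothetical.

```python
from collections import defaultdict


def assign_circuits(flows, num_wavelengths):
    """Greedy first-fit wavelength assignment (toy model, not ODDL's algorithm).

    flows: list of (src, dst) node pairs, one per traffic stream.
    num_wavelengths: number of wavelengths available per node (assumed).
    Returns (assigned, blocked): assigned maps flow index -> wavelength,
    blocked lists the indices of flows that could not get a circuit.
    """
    tx_busy = defaultdict(set)  # wavelengths already used to transmit at each source
    rx_busy = defaultdict(set)  # wavelengths already used to receive at each destination
    assigned, blocked = {}, []
    for i, (src, dst) in enumerate(flows):
        for w in range(num_wavelengths):
            # A circuit is feasible if the wavelength is free at both endpoints.
            if w not in tx_busy[src] and w not in rx_busy[dst]:
                tx_busy[src].add(w)
                rx_busy[dst].add(w)
                assigned[i] = w
                break
        else:
            # No free wavelength: the flow waits for a later reconfiguration.
            blocked.append(i)
    return assigned, blocked


if __name__ == "__main__":
    demo_flows = [(0, 1), (0, 2), (3, 1), (2, 0)]
    print(assign_circuits(demo_flows, num_wavelengths=2))
```

In this toy model, blocked flows would be retried after the next reconfiguration; how ODDL actually orders, schedules, and reconfigures circuits is described in the paper itself, not here.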
Pages: 342-357
Page count: 16