Dual adaptive training of photonic neural networks

Cited by: 39
Authors
Zheng, Ziyang [1, 2]
Duan, Zhengyang [1]
Chen, Hang [1]
Yang, Rui [2]
Gao, Sheng [1]
Zhang, Haiou [1]
Xiong, Hongkai [2]
Lin, Xing [1, 3]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai, Peoples R China
[3] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
CLASSIFICATION;
DOI
10.1038/s42256-023-00723-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Photonic neural networks (PNNs) are remarkable analogue artificial intelligence accelerators that compute using photons instead of electrons at low latency, high energy efficiency and high parallelism; however, the existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs, resulting in a considerable decrease in model performance in physical systems. Here we propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors and preserves its performance during deployment. By introducing the systematic error prediction networks with task-similarity joint optimization, DAT achieves high similarity mapping between the PNN numerical models and physical systems, as well as highly accurate gradient calculations during dual backpropagation training. We validated the effectiveness of DAT by using diffractive and interference-based PNNs on image classification tasks. Dual adaptive training successfully trained large-scale PNNs under major systematic errors and achieved high classification accuracies. The numerical and experimental results further demonstrated its superior performance over the state-of-the-art in situ training approaches. Dual adaptive training provides critical support for constructing large-scale PNNs to achieve advanced architectures and can be generalized to other types of artificial intelligence systems with analogue computing errors.

Despite their efficiency advantages, the performance of photonic neural networks is hampered by the accumulation of inherent systematic errors. Zheng et al. propose a dual backpropagation training approach, which allows the network to adapt to systematic errors, thus outperforming state-of-the-art in situ training approaches.
Pages: 1119 / +
Page count: 21