35 entries in total
[1]
Achour E, 2022, J MACH LEARN RES, V23
[2]
Allen-Zhu Z., 2019, P ADV NEUR INF PROC, P6676
[3]
Arjovsky M, 2016, PR MACH LEARN RES, V48
[4]
Arora S, 2019, Arxiv, DOI [arXiv:1810.02281, DOI 10.48550/ARXIV.1810.02281]
[5]
Bansal N., 2018, Advances in Neural Information Processing Systems, P4261
[6]
Bartlett Peter, 2018, INT C MACHINE LEARNI, P521
[7]
Learning long-term dependencies with gradient descent is difficult [J]. IEEE Transactions on Neural Networks, 1994, 5(02): 157-166
[8]
Chatterjee S, 2022, Arxiv, DOI arXiv:2203.16462
[9]
Cisse M, 2017, PR MACH LEARN RES, V70
[10]
Cogswell M, 2016, Arxiv, DOI arXiv:1511.06068