48 entries in total
[1]
Abnar S, 2020, Arxiv, DOI arXiv:2006.00555
[2]
Anandkumar A, 2017, Arxiv, DOI arXiv:1610.09322
[3]
Bahdanau D, 2016, Arxiv, DOI arXiv:1409.0473
[4]
Attention Augmented Convolutional Networks [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 3285-3294
[5]
Carion N, 2020, Arxiv, DOI arXiv:2005.12872
[6]
UNITER: UNiversal Image-TExt Representation Learning [J]. COMPUTER VISION - ECCV 2020, PT XXX, 2020, 12375: 104-120
[7]
Chen YP, 2018, Arxiv, DOI arXiv:1810.11579
[8]
Cordonnier JB, 2020, Arxiv, DOI arXiv:1911.03584
[9]
d'Ascoli S, 2019, ADV NEURAL INFORM PR, P9334
[10]
Devlin J, 2019, Arxiv, DOI [arXiv:1810.04805, 10.48550/arXiv.1810.04805]