41 entries in total
[2]
Ambrus R, 2014, IEEE INT C INT ROBOT, P1854, DOI 10.1109/IROS.2014.6942806
[3]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[4]
SPICE: Semantic Propositional Image Caption Evaluation [J]. COMPUTER VISION - ECCV 2016, PT V, 2016, 9909: 382-398
[5]
Banerjee S., 2005, P ACL WORKSH INTR EX, P65
[6]
End-to-End Object Detection with Transformers [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229
[7]
Coppin P.R., 1996, Remote Sens. Rev, V13, P207, DOI 10.1080/02757259609532305
[8]
Meshed-Memory Transformer for Image Captioning [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10575-10584
[9]
Daudt RC, 2018, IEEE IMAGE PROC, P4063, DOI 10.1109/ICIP.2018.8451652
[10]
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848