共 90 条
[1]
nocaps: novel object captioning at scale
[J].
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019),
2019,
:8947-8956
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:6077-6086
[3]
SPICE: Semantic Propositional Image Caption Evaluation
[J].
COMPUTER VISION - ECCV 2016, PT V,
2016, 9909
:382-398
[5]
Brown TB, 2020, ADV NEUR IN, V33
[7]
End-to-End Object Detection with Transformers
[J].
COMPUTER VISION - ECCV 2020, PT I,
2020, 12346
:213-229
[8]
Chen X, 2022, Arxiv, DOI arXiv:2209.06794
[9]
Chung HW, 2022, Arxiv, DOI [arXiv:2210.11416, 10.48550/arXiv.2210.11416]