27 entries in total
[1]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 6077-6086
[2]
SPICE: Semantic Propositional Image Caption Evaluation [J]. Computer Vision - ECCV 2016, Pt V, 2016, 9909: 382-398
[3]
Banerjee S, 2005, P 2 WORKSH STAT MACH, P228, DOI 10.3115/1626355.1626389
[4]
Chen T, arXiv
[5]
Cornia M, 2020, PROC CVPR IEEE, P10575, DOI 10.1109/CVPR42600.2020.01059
[6]
Dhariwal P, 2021, ADV NEUR IN, V34
[7]
Fei ZC, 2019, arXiv, arXiv:1912.06365
[8]
Fei ZC, 2021, AAAI CONF ARTIF INTE, V35, P1309
[10]
Ho J, 2020, Advances in Neural Information Processing Systems, V33, P6840, DOI 10.48550/arXiv.2006.11239