80 entries in total
- [2] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077 - 6086
- [3] SPICE: Semantic Propositional Image Caption Evaluation [J]. COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 : 382 - 398
- [4] Convolutional Image Captioning [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 5561 - 5570
- [7] Cornia M, 2020, PROC CVPR IEEE, P10575, DOI 10.1109/CVPR42600.2020.01059
- [8] Denkowski M., 2014, Proceedings of the ninth workshop on statistical machine translation, P376, DOI 10.3115/V1/W14-3348
- [9] Devlin J, 2019, Arxiv, DOI arXiv:1810.04805
- [10] Dosovitskiy A., 2021, INT C LEARNING REPRE