47 entries in total
- [1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
- [2] The Unreasonable Effectiveness of CLIP Features for Image Captioning: An Experimental Analysis [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 4661-4669
- [3] Cho KYHY, 2014, Arxiv, DOI [arXiv:1406.1078, DOI 10.48550/ARXIV.1406.1078]
- [4] Cornia M, 2020, PROC CVPR IEEE, P10575, DOI 10.1109/CVPR42600.2020.01059
- [5] Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10687-10696
- [6] Dhir R, 2019, COMPUT SIST, V23, P693, DOI 10.13053/CyS-23-3-3269
- [7] Elliott D., 2013, EMNLP, P1292
- [8] Every Picture Tells a Story: Generating Sentences from Images [J]. COMPUTER VISION-ECCV 2010, PT IV, 2010, 6314 : 15 - +
- [9] DeeCap: Dynamic Early Exiting for Efficient Image Captioning [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 12206-12216
- [10] Unsupervised Image Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 4120-4129