47 entries in total
[1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[2] [Anonymous], 2011, CNLL
[3] [Anonymous], 2013, P 2013 C EMP METH NA
[4] The Unreasonable Effectiveness of CLIP Features for Image Captioning: An Experimental Analysis [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 4661-4669
[5] Cho KYHY, 2014, Arxiv, DOI arXiv:1406.1078
[6] Meshed-Memory Transformer for Image Captioning [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10575-10584
[7] Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10687-10696
[8] Dhir R, 2019, COMPUT SIST, V23, P693, DOI [10.13053/cys-23-3-3269, 10.13053/CyS-23-3-3269]
[9] Every Picture Tells a Story: Generating Sentences from Images [J]. COMPUTER VISION-ECCV 2010, PT IV, 2010, 6314: 15-+
[10] DeeCap: Dynamic Early Exiting for Efficient Image Captioning [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 12206-12216