58 entries in total
[1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 6077-6086
[2] SPICE: Semantic Propositional Image Caption Evaluation. Computer Vision - ECCV 2016, Pt V, 2016, 9909: 382-398
[3] Bahdanau D, 2016, arXiv, DOI 10.48550/arXiv.1409.0473
[4] Banerjee S., 2005, P ACL WORKSH INTR EX, P65
[5] Chen F., 2021, P 2021 IEEE 30 INT S, P1, DOI 10.1007/978-3-030-51812-7_172-1
[6] GroupCap: Group-based Image Captioning with Structured Relevance and Diversity Constraints. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 1345-1353
[7] SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 6298-6306
[8] Cho K., 2014, C EMP METH NAT LANG, DOI 10.48550/arXiv.1406.1078
[9] Rethinking the Form of Latent States in Image Captioning. Computer Vision - ECCV 2018, Pt V, 2018, 11209: 294-310
[10] Dauphin YN, 2017, PR MACH LEARN RES, V70