共 18 条
[1]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:6077-6086
[2]
Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:4042-4050
[3]
Learning to Evaluate Image Captioning
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:5804-5812
[4]
Faghri Fartash, 2017, arXiv
[5]
Scene Graph Generation with External Knowledge and Image Reconstruction
[J].
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019),
2019,
:1969-1978
[6]
Hu ZB, 2019, PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P789
[7]
Karpathy A, 2015, PROC CVPR IEEE, P3128, DOI 10.1109/CVPR.2015.7298932
[8]
Kiros R, 2014, PR MACH LEARN RES, V32, P595
[9]
Stacked Cross Attention for Image-Text Matching
[J].
COMPUTER VISION - ECCV 2018, PT IV,
2018, 11208
:212-228
[10]
Microsoft COCO: Common Objects in Context
[J].
COMPUTER VISION - ECCV 2014, PT V,
2014, 8693
:740-755