76 entries in total
[1]
Variational Information Distillation for Knowledge Transfer
[J].
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019),
2019,
:9155-9163
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:6077-6086
[3]
SPICE: Semantic Propositional Image Caption Evaluation
[J].
COMPUTER VISION - ECCV 2016, PT V,
2016, 9909
:382-398
[4]
[Anonymous], 2020, P IEEE C COMP VIS PA, DOI 10.1109/BIBM49941.2020.9313406
[5]
[Anonymous], 2019, P IEEE CVF C COMP VI
[6]
[Anonymous], 2020, ECCV, DOI 10.1109/QCE49297.2020.00054
[7]
VQA: Visual Question Answering
[J].
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV),
2015,
:2425-2433
[8]
Brown TB, 2020, ADV NEUR IN, V33
[9]
Bucilua Cristian, 2006, P 12 ACM SIGKDD INT, P535
[10]
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
[J].
COMPUTER VISION - ECCV 2020, PT VI,
2020, 12351
:565-580