41 entries in total
[1]
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3674-3683
[2]
SPICE: Semantic Propositional Image Caption Evaluation [J]. COMPUTER VISION - ECCV 2016, PT V, 2016, 9909: 382-398
[3]
[Anonymous], 2019, ENGLISH SPEAKING WOR
[4]
Artetxe Mikel., 2018, ICLR, DOI 10.18653/v1/D18-1399
[5]
Banerjee S., 2005, P ACL WORKSH INTR EX, P65
[6]
Ding H., 2018, PROC CVPR IEEE, P2393, DOI 10.1109/CVPR.2018.00254
[7]
Fang H, 2015, PROC CVPR IEEE, P1473, DOI 10.1109/CVPR.2015.7298754
[8]
Spatio-temporal Video Re-localization by Warp LSTM [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 1288-1297
[9]
Goodfellow I., 2014, NeurIPS, V27, P1
[10]
Gu J., 2017, AAAI