18 entries in total
[1] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3674-3683
[2] [Anonymous], 2016, Oxid. Med. Cell. Longev., DOI 10.1155/2016/1689602
[3] [Anonymous], 2014, P 2014 C EMPIRICAL M, DOI 10.3115/v1/D14-1082
[4] VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[5] Knowledge Aided Consistency for Weakly Supervised Phrase Grounding [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 4042-4050
[6] Chen S., 2022, P IEEECVF C COMPUTER, P15534
[7] TransVG: End-to-End Visual Grounding with Transformers [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1749-1759
[8] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15492-15502
[9] Kingma DP, 2014, ADV NEUR IN, V27
[10] Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 2611-2620