40 entries in total
- [1] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 3674-3683
- [2] [Anonymous], 1994, P ANN M ASS COMP LIN
- [3] [Anonymous], 2014, P WORKSH INT LANG LE
- [4] A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility [J]. Computer Vision, ECCV 2022, Pt VIII, 2022, 13668: 312-328
- [5] WebQA: Multihop and Multimodal QA [J]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 16474-16483
- [6] Chen S., 2021, Advances in Neural Information Processing Systems (NeurIPS), P5834
- [7] Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation [J]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 16516-16526
- [8] Deng X, 2023, arXiv, DOI arXiv:2306.06070
- [9] Furuta H, 2024, arXiv, DOI arXiv:2305.11854
- [10] Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression [J]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021, 2021: 3063-3072