72 entries in total
- [1] Anderson P, 2018, Arxiv, DOI [arXiv:1807.06757, 10.48550/ARXIV.1807.06757]
- [2] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3674-3683
- [3] [Anonymous], 2014, P INT C NEUR INF PRO, DOI 10.48550/ARXIV.1412.3555
- [4] Batra D, 2020, Arxiv, DOI [arXiv:2006.13171, 10.48550/ARXIV.2006.13171]
- [5] Brown TB, 2020, ADV NEUR IN, V33
- [7] Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 16516-16526
- [9] ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 2432-2443
- [10] Dai ZH, 2019, 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), P2978