105 entries in total
[51]
Ku A, 2020, PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), P4392
[52]
Li C., 2022, C ROBOT LEARNING, P455
[53]
Li J., 2022, CVPR, P15407
[54]
Li Jialu, 2023, ARXIV230519195
[55]
Microsoft COCO: Common Objects in Context [J]. COMPUTER VISION - ECCV 2014, PT V, 2014, 8693: 740-755
[56]
Liu C., 2021, P IEEE CVF INT C COM, P1644
[57]
Loshchilov I., 2019, P INT C LEARN REPR
[58]
Luo Haokuan, 2022, ARXIV220307359
[59]
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web [J]. COMPUTER VISION - ECCV 2020, PT VI, 2020, 12351: 259-274
[60]
Maksymets O., 2021, P IEEE CVF INT C COM, P15374