50 entries in total
[2]
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 9687-9695
[3]
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3674-3683
[4]
Chen L, 2021, AAAI CONF ARTIF INTE, V35, P1036
[5]
Counterfactual Samples Synthesizing for Robust Visual Question Answering [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10797-10806
[6]
Selective Comprehension for Referring Expression by Prebuilt Entity Dictionary with Modular Networks [J]. KNOWLEDGE MANAGEMENT AND ACQUISITION FOR INTELLIGENT SYSTEMS (PKAW 2018), 2018, 11016: 211-220
[7]
TransVG: End-to-End Visual Grounding with Transformers [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1749-1759
[8]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[9]
Du Yunhao, 2022, 2022 IEEE INT C MULT, P1, DOI 10.1109/ICME52920.2022.9859880
[10]
Modularized Textual Grounding for Counterfactual Resilience [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 6371-6381