56 entries in total
[42]
Vaswani A, 2017, ADV NEUR IN, V30
[43]
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 1960-1968
[44]
Non-local Neural Networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 7794-7803
[45]
Wang Z., 2022, CVPR, P11686
[46]
Xie EZ, 2021, ADV NEUR IN, V34
[47]
Bottom-Up Shift and Reasoning for Referring Image Segmentation. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), 2021: 11261-11270
[48]
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 18134-18144
[49]
Improving One-Stage Visual Grounding by Recursive Sub-query Construction. Computer Vision - ECCV 2020, Pt XIV, 2020, 12359: 387-404
[50]
A Fast and Accurate One-Stage Approach to Visual Grounding. 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019), 2019: 4682-4692