43 items in total
[21] A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10877-10886
[22] RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 5168-5177
[23] Microsoft COCO: Common Objects in Context [J]. COMPUTER VISION - ECCV 2014, PT V, 2014, 8693: 740-755
[24] Recurrent Multimodal Interaction for Referring Image Segmentation [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 1280-1289
[25] Learning to Assemble Neural Module Tree Networks for Visual Grounding [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 4672-4681
[26] Long J, 2015, PROC CVPR IEEE, P3431, DOI 10.1109/CVPR.2015.7298965
[27] Generation and Comprehension of Unambiguous Object Descriptions [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 11-20
[28] Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries [J]. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215: 656-672
[29] Kipf TN, 2017, arXiv:1609.02907, DOI 10.48550/arXiv.1609.02907
[30] Pennington J, 2014, P 2014 C EMP METH NA, DOI 10.3115/v1/D14-1162