72 entries in total
- [51] Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 1960-1968
- [52] Unsupervised Feature Learning via Non-Parametric Instance Discrimination [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3733-3742
- [53] Yang S, 2019, AAAI CONF ARTIF INTE, P5644
- [54] Weakly-Supervised Video Object Grounding by Exploring Spatio-Temporal Contexts [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 1939-1947
- [55] A Fast and Accurate One-Stage Approach to Visual Grounding [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 4682-4692
- [56] Yang Zhengyuan, 2020, ECCV
- [57] Yang Zhengyuan, 2020, IEEE T CIRCUITS SYST
- [59] MAttNet: Modular Attention Network for Referring Expression Comprehension [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 1307-1315
- [60] A Joint Speaker-Listener-Reinforcer Model for Referring Expressions [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3521-3529