72 entries in total
- [1] Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12479-12488
- [2] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3674-3683
- [3] [Anonymous], 2015, arXiv:1504.00325
- [4] [Anonymous], ICPR
- [5] [Anonymous], ICIP
- [6] [Anonymous], ICCV
- [7] [Anonymous], 2017, IJCAI
- [8] Aytar Y, 2016, ADV NEUR IN, V29
- [9] Bahdanau D, 2016, arXiv:1409.0473, DOI 10.48550/arXiv.1409.0473
- [10] Bengio S, 2015, Advances in Neural Information Processing Systems