21 entries in total
- [1] Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 3674-3683
- [2] [Anonymous], 2015, arXiv:1504.00325
- [3] Bo Qu, 2016, 2016 INT C COMP INF
- [4] Chen L, 2017, IEEE C COMP VIS PATT
- [5] Gers F. A., 2001, Long short-term memory in recurrent neural networks, DOI 10.5075/EPFL-THESIS-2366
- [6] Li S., 2011, P 15 C COMP NAT LANG, P220
- [7] Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 3242-3250
- [8] Exploring Models and Data for Remote Sensing Image Caption Generation. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2183-2195
- [9] Mao Junhua, 2014, Explain images with multimodal recurrent neural networks
- [10] Ordonez V., 2011, Advances in Neural Information Processing Systems, V24, P1143