117 entries in total
- [1] Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 12479 - 12488
- [3] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6077 - 6086
- [5] Banerjee Satanjeev, 2007, METEOR: An automatic metric for MT evaluation with improved correlation with human judgments, P65
- [6] Bengio S, 2015, ADV NEUR IN, V28
- [7] Bowman S. R., 2015, Computer Science, V2015
- [10] Chen KZ, 2021, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, P4456