117 entries in total
- [21] Duan Chaoqun, 2020, Multimodal matching transformer for live commenting
- [22] Every Picture Tells a Story: Generating Sentences from Images [J]. COMPUTER VISION-ECCV 2010, PT IV, 2010, 6314 : 15 - +
- [23] StyleNet: Generating Attractive Visual Captions with Styles [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 955 - 964
- [24] Self-critical n-step Training for Image Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 6293 - 6301
- [25] Gong YC, 2014, LECT NOTES COMPUT SC, V8692, P529, DOI 10.1007/978-3-319-10593-2_35
- [26] MSCap: Multi-Style Image Captioning with Unpaired Stylized Text [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 4199 - 4208
- [27] Deep Residual Learning for Image Recognition [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770 - 778
- [28] Hochreiter S, 1997, NEURAL COMPUT, V9, P1735, DOI 10.1162/neco.1997.9.8.1735
- [30] Hou RB, 2019, ADV NEUR IN, V32