52 entries in total
- [1] Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12479-12488
- [2] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
- [3] Nguyen A, 2018, IEEE INT CONF ROBOT, P3782
- [4] Banerjee S., 2005, P ACL WORKSH INTR EX, P65
- [5] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
- [6] Chen David, 2011, P 49 ANN M ASS COMP, P190
- [7] Chen M., 2018, P MACHINE LEARNING, P847
- [8] Chen SX, 2019, AAAI CONF ARTIF INTE, P8191
- [9] Video Captioning with Guidance of Multimodal Latent Topics [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 1838-1846