53 entries in total
- [1] Unsupervised Learning from Narrated Instruction Videos [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 4575-4583
- [2] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
- [3] [Anonymous], P ACM INT C IM VID R
- [4] [Anonymous], 2017, P IEEE INT C COMP VI
- [5] [Anonymous], 2013, T ASSOC COMPUT LING
- [6] [Anonymous], P EMNLP
- [7] [Anonymous], 2017, P 31 INT C NEURAL IN
- [8] Weakly-Supervised Alignment of Video With Text [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4462-4470
- [9] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
- [10] Rethinking the Faster R-CNN Architecture for Temporal Action Localization [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 1130-1139