64 entries in total
[51] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition [J]. COMPUTER VISION - ECCV 2016, PT VIII, 2016, 9912: 20-36
[52] Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification [J]. COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219: 318-335
[53] Discriminatively Embedded K-Means for Multi-view Clustering [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 5356-5364
[55] Describing Videos by Exploiting Temporal Structure [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4507-4515
[56] Yehao L., 2021, P MM
[57] Fine-grained Video Captioning for Sports Narrative [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6006-6015
[58] Temporal Query Networks for Fine-grained Video Understanding [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 4484-4494
[59] Object-aware Aggregation with Bidirectional Temporal Graph for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 8319-8328
[60] Zhang Z., 2020, P CVPR