22 entries in total
[1] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
[2] Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 8421-8431
[3] Duan X, 2018, ADV NEUR IN, V31
[4] TALL: Temporal Activity Localization via Language Query [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5277-5285
[5] Gao M., 2019, arXiv
[6] Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 7179-7188
[7] Dense-Captioning Events in Videos [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 706-715
[8] Li K, 2021, AAAI CONF ARTIF INTE, V35, P1902
[9] Lin ZJ, 2020, AAAI CONF ARTIF INTE, V34, P11539