83 entries in total
[51]
Romero A., 2014, FITNETS HINTS THIN D
[52]
Soomro Khurram, 2012, CRCVTR1201
[53]
Sun Chen, 2019, Learning video representations using contrastive bidirectional transformer
[54]
Tian Yonglong, 2019, P EUR C COMP VIS
[55]
Tschannen M., 2020, P 8 INT C LEARN REPR
[56]
van den Oord Aaron, 2018, ARXIV
[57]
Villegas Ruben, 2017, P INT C LEARN REPR
[58]
Tracking Emerges by Colorizing Videos [J]. COMPUTER VISION - ECCV 2018, PT XIII, 2018, 11217: 402-419
[59]
Anticipating Visual Representations from Unlabeled Video [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 98-106
[60]
Reconstruction Network for Video Captioning [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7622-7631