Better Learning Shot Boundary Detection via Multi-task
Cited by: 2
Authors:
Zhang, Haoxin [1]
Li, Zhimin [2]
Lu, Qinglin [1]
Affiliations:
[1] Tencent Data Platform, Shenzhen, Peoples R China
[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source:
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Keywords:
Shot boundary detection;
Spatio-temporal attention;
Multi-task learning;
Dynamic loss;
DOI:
10.1145/3474085.3479206
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
Shot boundary detection (SBD) plays an important role in video understanding, since most recent works take the shot as minimal granularity instead of frames for upstream tasks. However, the large variations of hard-cut and gradual-change transitions within shots significantly limit the performance of SBD. To deal with the variations, we propose a multi-task architecture called Transnet++. Transnet++ disentangles the two types of transition and adopts two separate branches to predict them respectively. The two branches share the same video knowledge space and their results are fused for final prediction. Moreover, we propose a spatial attention module (SAM) to enhance the feature representations, which suffer from redundant padding regions. Meanwhile, a temporal attention module (TAM) is applied to capture the long-term information of the video for alleviating the over-segmentation problem. Experimental results (91.16% F1-score) on Tencent AVS Dataset demonstrate the effectiveness and superiority of Transnet++ for SBD.
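The abstract says the hard-cut and gradual-change branches are fused for the final prediction but does not state the fusion rule. The following is only a minimal sketch of one plausible reading, assuming each branch emits a per-frame transition probability, that fusion is an element-wise maximum, and that one boundary is reported per run of frames above a threshold. The function names `fuse_branches` and `detect_boundaries`, the max rule, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from typing import List


def fuse_branches(hard_cut: List[float], gradual: List[float]) -> List[float]:
    """Fuse two per-frame probability sequences by element-wise max.

    Assumption: the max rule is illustrative; the paper's fusion may differ.
    """
    if len(hard_cut) != len(gradual):
        raise ValueError("both branches must score the same number of frames")
    return [max(h, g) for h, g in zip(hard_cut, gradual)]


def detect_boundaries(fused: List[float], threshold: float = 0.5) -> List[int]:
    """Return one frame index per run of consecutive frames at or above threshold.

    The reported index is the highest-scoring frame of the run (first one on ties),
    so a gradual transition spanning several frames yields a single boundary.
    """
    boundaries: List[int] = []
    run_start = None
    # Iterate one step past the end so a run touching the last frame is closed.
    for i in range(len(fused) + 1):
        above = i < len(fused) and fused[i] >= threshold
        if above and run_start is None:
            run_start = i
        elif not above and run_start is not None:
            peak = max(range(run_start, i), key=lambda j: fused[j])
            boundaries.append(peak)
            run_start = None
    return boundaries


# Hypothetical scores: a sharp cut at frame 1 and a gradual change over frames 3-5.
hard_cut_scores = [0.1, 0.9, 0.1, 0.0, 0.0, 0.0]
gradual_scores = [0.0, 0.2, 0.1, 0.6, 0.8, 0.7]
fused_scores = fuse_branches(hard_cut_scores, gradual_scores)
print(detect_boundaries(fused_scores))
```

Collapsing each above-threshold run to a single peak is one simple way to avoid reporting several boundaries for one gradual transition; it is not a substitute for the temporal attention module described in the abstract.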
Pages: 4730-4734
Number of pages: 5