Better Learning Shot Boundary Detection via Multi-task

Cited by: 2
Authors
Zhang, Haoxin [1]
Li, Zhimin [2]
Lu, Qinglin [1]
Affiliations
[1] Tencent Data Platform, Shenzhen, Peoples R China
[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Keywords
Shot boundary detection; Spatio-temporal attention; Multi-task learning; Dynamic loss;
DOI
10.1145/3474085.3479206
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Shot boundary detection (SBD) plays an important role in video understanding, since most recent works take the shot, rather than the frame, as the minimal granularity for upstream tasks. However, the large variations in hard-cut and gradual-change transitions within shots significantly limit the performance of SBD. To deal with these variations, we propose a multi-task architecture called Transnet++. Transnet++ disentangles the two types of transition and adopts two separate branches to predict them respectively. The two branches share the same video knowledge space, and their results are fused for the final prediction. Moreover, we propose a spatial attention module (SAM) to enhance the feature representations, which suffer from redundant padding regions. Meanwhile, a temporal attention module (TAM) is applied to capture long-term information in the video and alleviate the over-segmentation problem. Experimental results (91.16% F1-score) on the Tencent AVS Dataset demonstrate the effectiveness and superiority of Transnet++ for SBD.
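The two-branch fusion and the F1 metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the branch outputs are fused, so the element-wise maximum and the 0.5 threshold below are assumptions, and the names `fuse_branches` and `f1_score` are hypothetical. This is not the authors' implementation.

```python
from typing import List


def fuse_branches(hard_cut: List[float], gradual: List[float],
                  threshold: float = 0.5) -> List[int]:
    """Return frame indices predicted as shot boundaries.

    Assumption: the two branches are fused by an element-wise maximum of
    their per-frame probabilities, then thresholded. The abstract only
    says the results are fused; the actual rule may differ.
    """
    if len(hard_cut) != len(gradual):
        raise ValueError("both branches must score the same number of frames")
    fused = [max(h, g) for h, g in zip(hard_cut, gradual)]
    return [i for i, p in enumerate(fused) if p >= threshold]


def f1_score(predicted: List[int], truth: List[int]) -> float:
    """F1 = harmonic mean of precision and recall over exact frame matches."""
    true_positives = len(set(predicted) & set(truth))
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(set(predicted))
    recall = true_positives / len(set(truth))
    return 2 * precision * recall / (precision + recall)


# Toy per-frame probabilities from the two (hypothetical) branches.
boundaries = fuse_branches([0.1, 0.9, 0.2, 0.3], [0.2, 0.1, 0.7, 0.4])
score = f1_score(boundaries, [1, 3])
```

Exact frame matching is used here for brevity; SBD benchmarks often allow a small tolerance window around each ground-truth boundary.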
Pages: 4730-4734
Page count: 5