Multiple Temporal Fusion based Weakly-supervised Pre-training Techniques for Video Categorization

Cited by: 1

Authors:

Cai, Xiaochen [1]

Cai, Hengxing [1]

Zhu, Boqing [2]

Xu, Kele [2]

Tu, Weiwei [1]

Feng, Dawei [2]

Affiliations:

[1] 4Paradigm Inc, Beijing, Peoples R China

[2] Natl Univ Def Technol, Changsha, Peoples R China

Source:

PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022

Keywords:

Representation Learning; Video Understanding; Pre-training; Video Transformer; Temporal Resolution;

DOI:

10.1145/3503161.3551585

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

In this paper, we present our solution to the ACM Multimedia 2022 Pre-training for Video Understanding Challenge. First, we pre-train the models on large-scale weakly-supervised video datasets at different temporal resolutions, then fine-tune them for the downstream application. Quantitative comparisons are conducted to evaluate the performance of different networks at multiple temporal resolutions. Moreover, we fuse the different pre-trained models through weighted averaging. We achieve an accuracy of 62.39% on the test set, ranking first in the video categorization track of this challenge.
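The abstract names weighted averaging as the fusion step but gives no details. The following is only a minimal sketch of that general technique, assuming each model outputs per-class probabilities for every video. The function names (`fuse_predictions`, `predict`), the example probabilities, and the weights are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_predictions(
    model_probs: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Weighted average of per-model class probabilities.

    model_probs[m][i][c] is the probability that model m assigns to
    class c for video i. Weights are normalized to sum to 1.
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]

    n_videos = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    fused = [[0.0] * n_classes for _ in range(n_videos)]
    for w, probs in zip(norm, model_probs):
        for i in range(n_videos):
            for c in range(n_classes):
                fused[i][c] += w * probs[i][c]
    return fused


def predict(fused: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring class index for each video."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Hypothetical outputs of two models on two videos with two classes.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.2, 0.8], [0.4, 0.6]]

# Hypothetical weights; in practice these might be tuned on a validation set.
fused = fuse_predictions([model_a, model_b], weights=[1.0, 3.0])
labels = predict(fused)
```

With these illustrative weights the second model contributes three times as much as the first, so the fused prediction for the first video follows the second model even though the first model disagrees.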

Pages: 7089-7093

Page count: 5

