The 2nd YouTube-8M Large-Scale Video Understanding Challenge

Cited by: 9

Authors:

Lee, Joonseok [1]

Natsev, Apostol [1]

Reade, Walter [1]

Sukthankar, Rahul [1]

Toderici, George [1]

Affiliations:

[1] Google Research, Mountain View, CA 94043, USA

Source:

COMPUTER VISION - ECCV 2018 WORKSHOPS, PT IV | 2019 / Vol. 11132

Keywords:

YouTube; Video Classification; Video Understanding;

DOI:

10.1007/978-3-030-11018-5_18

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We hosted the 2nd YouTube-8M Large-Scale Video Understanding Kaggle Challenge and Workshop at ECCV'18, with the task of classifying videos from frame-level and video-level audio-visual features. In this year's challenge, we restricted the final model size to 1 GB or less, encouraging participants to explore representation learning or better architectures instead of heavy ensembles of multiple models. In this paper, we briefly introduce the YouTube-8M dataset and challenge task, followed by participant statistics and result analysis. We summarize the ideas proposed by participants, including architectures, temporal aggregation methods, ensembling and distillation, data augmentation, and more.

Pages: 193-205

Page count: 13