Learnable Pooling Methods for Video Classification

Cited by: 2

Authors:

Kmiec, Sebastian [1]

Bae, Juhan [1]

An, Ruijian [1]

Affiliations:

[1] University of Toronto, Toronto, ON, Canada

Source:

COMPUTER VISION - ECCV 2018 WORKSHOPS, PT IV | 2019 / Vol. 11132

Keywords:

Video classification; YouTube-8M; NetVLAD; Attention; Pooling; Aggregation

DOI:

10.1007/978-3-030-11018-5_21

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We introduce modifications to state-of-the-art approaches to aggregating local video descriptors by using attention mechanisms and function approximations. Rather than using ensembles of existing architectures, we offer insight into creating new architectures. We demonstrate our solutions in "The 2nd YouTube-8M Video Understanding Challenge", using frame-level video and audio descriptors. We obtain testing accuracy similar to the state of the art while meeting budget constraints, and touch upon strategies to improve the state of the art. Model implementations are available at https://github.com/pomonam/LearnablePoolingMethods.
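As a rough illustration of what attention-based aggregation of frame-level descriptors can look like, the sketch below pools a sequence of per-frame vectors into one video-level vector using softmax weights. It is not the authors' architecture: the function name `attention_pool`, the single `query` vector (standing in for a learned parameter), and the toy inputs are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Aggregate T frame descriptors (each of length D) into one length-D vector.

    `query` is a hypothetical stand-in for a learnable parameter: each frame is
    scored by its dot product with `query`, the scores are normalized with a
    softmax, and the frames are averaged using those weights.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input: three 2-dimensional frame descriptors (illustrative values only).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# With an all-zero query every frame gets the same score, so the result
# reduces to plain average pooling.
pooled, weights = attention_pool(frames, [0.0, 0.0])

# A non-zero query shifts weight toward frames aligned with it.
pooled_q, weights_q = attention_pool(frames, [2.0, 0.0])
```

In a trained model the query (or a small scoring network) would be learned jointly with the classifier, so the weights indicate which frames matter most for the labels; here the values are fixed by hand only to show the mechanics.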

Pages: 229-238

Page count: 10

