Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

Cited by: 174

Authors:

Bagautdinov, Timur [1]

Alahi, Alexandre [2]

Fleuret, Francois [1,3]

Fua, Pascal [1]

Savarese, Silvio [2]

Affiliations:

[1] Ecole Polytech Fed Lausanne, Lausanne, Switzerland

[2] Stanford Univ, Stanford, CA 94305 USA

[3] IDIAP Res Inst, Martigny, Switzerland

Source:

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) | 2017

Funding:

Swiss National Science Foundation;

Keywords:

PEOPLE;

DOI:

10.1109/CVPR.2017.365

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks.

Citation

Pages: 3425-3434

Page count: 10
