Simultaneous multi-person tracking and activity recognition based on cohesive cluster search

Cited by: 5
Authors
Li, Wenbo [1]
Wei, Yi [1]
Lyu, Siwei [3]
Chang, Ming-Ching [2]
Affiliations
[1] Samsung Res AI Ctr, 665 Clyde Ave, Mountain View, CA USA
[2] SUNY Albany, 1400 Washington Ave, Albany, NY 12222 USA
[3] SUNY Buffalo, 12 Capen Hall, Buffalo, NY 14260 USA
Keywords
Group activity; Collective activity recognition; Pairwise interaction; Multi-person tracking
DOI
10.1016/j.cviu.2021.103301
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a bootstrapping framework to simultaneously improve multi-person tracking and activity recognition at the individual, interaction, and social group activity levels. The inference consists of identifying the trajectories of all pedestrian actors, individual activities, pairwise interactions, and collective activities, given the observed pedestrian detections. Our method uses a graphical model to represent and solve the joint tracking and recognition problems in three stages: (i) activity-aware tracking, (ii) joint interaction recognition and occlusion recovery, and (iii) collective activity recognition. This full-stack problem induces great complexity in learning the representations for the sub-problems at each stage, and the complexity increases as more stages are added to the system. Our solution is to make use of symbolic cues for inference at higher stages, inspired by the observation of cohesive clusters at different stages. This also avoids learning more ambiguous representations in the higher stages. High-order correlations among the visible and occluded individuals, pairwise interactions, groups, and activities are then solved using the cohesive cluster search within a Bayesian framework. Experiments on several benchmarks show the advantages of our approach over existing methods.
Pages: 13
Related papers
31 in total
[1]   Human Activity Analysis: A Review [J].
Aggarwal, J. K. ;
Ryoo, M. S. .
ACM COMPUTING SURVEYS, 2011, 43 (03)
[2]   Monte Carlo Tree Search for Scheduling Activity Recognition [J].
Amer, Mohamed R. ;
Todorovic, Sinisa ;
Fern, Alan ;
Zhu, Song-Chun .
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, :1353-1360
[3]  
Amer MR, 2014, LECT NOTES COMPUT SC, V8694, P572, DOI 10.1007/978-3-319-10599-4_37
[4]  
Antic B, 2014, LECT NOTES COMPUT SC, V8689, P33, DOI 10.1007/978-3-319-10590-1_3
[5]   Convolutional Relational Machine for Group Activity Recognition [J].
Azar, Sina Mokhtarzadeh ;
Atigh, Mina Ghadimi ;
Nickabadi, Ahmad ;
Alahi, Alexandre .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :7884-7893
[6]   Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition [J].
Bagautdinov, Timur ;
Alahi, Alexandre ;
Fleuret, Francois ;
Fua, Pascal ;
Savarese, Silvio .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :3425-3434
[7]   A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection [J].
Cai, Zhaowei ;
Fan, Quanfu ;
Feris, Rogerio S. ;
Vasconcelos, Nuno .
COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 :354-370
[8]  
Chang MC, 2011, IEEE I CONF COMP VIS, P747, DOI 10.1109/ICCV.2011.6126312
[9]   Understanding Collective Activities of People from Videos [J].
Choi, Wongun ;
Savarese, Silvio .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (06) :1242-1257
[10]  
Choi W, 2012, LECT NOTES COMPUT SC, V7575, P215, DOI 10.1007/978-3-642-33765-9_16