Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World

Cited by: 107
Authors
Fabbri, Matteo [1]
Lanzi, Fabio [1]
Calderara, Simone [1]
Palazzi, Andrea [1]
Vezzani, Roberto [1]
Cucchiara, Rita [1]
Affiliations
[1] Univ Modena & Reggio Emilia, Dept Engn Enzo Ferrari, Modena, Italy
Source
COMPUTER VISION - ECCV 2018, PT IV | 2018 / Vol. 11208
Keywords
Pose estimation; Tracking; Surveillance; Occlusions;
DOI
10.1007/978-3-030-01225-0_27
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Multi-people tracking in an open-world setting requires a special effort in precise detection. Moreover, temporal continuity in the detection phase becomes more important when scene clutter introduces the challenging problem of occluded targets. To this end, we propose a deep network architecture that jointly extracts people's body parts and associates them across short temporal spans. Our model explicitly deals with occluded body parts by hallucinating plausible solutions for joints that are not visible. We propose a new end-to-end architecture composed of four branches (visible heatmaps, occluded heatmaps, part affinity fields and temporal affinity fields) fed by a time linker feature extractor. To overcome the lack of surveillance data with tracking, body part and occlusion annotations, we created a Computer Graphics dataset for people tracking in urban scenarios by exploiting a photorealistic videogame. To date, it is the largest such dataset of human body parts (about 500,000 frames and almost 10 million body poses). Our architecture, trained on virtual data, also generalizes well to public real tracking benchmarks when image resolution and sharpness are high enough, producing reliable tracklets useful for further batch data association or re-id modules.
Pages: 450-466
Page count: 17