Leveraging Two Kinect Sensors for Accurate Full-Body Motion Capture

Cited by: 32
Authors
Gao, Zhiquan [1]
Yu, Yao [1]
Zhou, Yu [1]
Du, Sidan [1]
Affiliation
[1] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210046, Jiangsu, Peoples R China
Keywords
motion capture; pose estimation; temporal constraint; Kinect sensors; HUMAN POSE; DEPTH; TRACKING; IMAGE
DOI
10.3390/s150924297
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Accurate motion capture plays an important role in sports analysis, the medical field and virtual reality. Current methods for motion capture often suffer from occlusions, which limits the accuracy of their pose estimation. In this paper, we propose a complete system to measure the pose parameters of the human body accurately. Different from previous monocular depth camera systems, we leverage two Kinect sensors to acquire more information about human movements, which ensures that we can still get an accurate estimation even when significant occlusion occurs. Because human motion is temporally constant, we adopt a learning analysis to mine the temporal information across the posture variations. Using this information, we estimate human pose parameters accurately, regardless of rapid movement. Our experimental results show that our system can perform an accurate pose estimation of the human body with the constraint of information from the temporal domain.
Pages: 24297-24317
Page count: 21
Related papers
27 in total
[1]   The space of human body shapes: reconstruction and parameterization from range scans [J].
Allen, B ;
Curless, B ;
Popovic, Z .
ACM TRANSACTIONS ON GRAPHICS, 2003, 22 (03) :587-594
[2]   SCAPE: Shape Completion and Animation of People [J].
Anguelov, D ;
Srinivasan, P ;
Koller, D ;
Thrun, S ;
Rodgers, J ;
Davis, J .
ACM TRANSACTIONS ON GRAPHICS, 2005, 24 (03) :408-416
[3]  
[Anonymous], 2011, VMV
[4]  
[Anonymous], 2012, MICROSOFT KINECT API
[5]  
Auvinet E., 2012, 2012 11th International Conference on Information Sciences, Signal Processing and their Applications (ISSPA), P478, DOI 10.1109/ISSPA.2012.6310598
[6]   Lucas-Kanade 20 years on: A unifying framework [J].
Baker, S ;
Matthews, I .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004, 56 (03) :221-255
[7]   Performance capture from sparse multi-view video [J].
de Aguiar, Edilson ;
Stoll, Carsten ;
Theobalt, Christian ;
Ahmed, Naveed ;
Seidel, Hans-Peter ;
Thrun, Sebastian .
ACM TRANSACTIONS ON GRAPHICS, 2008, 27 (03)
[8]  
Desbrun M, 1999, COMP GRAPH, P317, DOI 10.1145/311535.311576
[9]   Temporal denoising of Kinect depth data [J].
Essmaeel, Kyis ;
Gallo, Luigi ;
Damiani, Ernesto ;
De Pietro, Giuseppe ;
Dipanda, Albert .
8TH INTERNATIONAL CONFERENCE ON SIGNAL IMAGE TECHNOLOGY & INTERNET BASED SYSTEMS (SITIS 2012), 2012, :47-52
[10]   Maximum a Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains [J].
Gauvain, Jean-Luc ;
Lee, Chin-Hui .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (02) :291-298