Skeleton-Based Action Segmentation With Multi-Stage Spatial-Temporal Graph Convolutional Neural Networks

Cited by: 22
Authors
Filtjens, Benjamin [1 ,2 ]
Vanrumste, Bart [3 ]
Slaets, Peter [1 ]
Affiliations
[1] Katholieke Univ Leuven, Dept Mech Engn, B-3001 Leuven, Belgium
[2] Katholieke Univ Leuven, Dept Elect Engn ESAT, B-3001 Leuven, Belgium
[3] Katholieke Univ Leuven, Dept Elect Engn ESAT, B-3001 Leuven, Belgium
Keywords
Activity segmentation; activity detection; dense labelling; freezing of gait; graph convolutional; MS-GCN; multi-stage; spatial-temporal; gait
DOI
10.1109/TETC.2022.3230912
Chinese Library Classification (CLC)
TP [automation technology, computer technology]
Discipline code
0812
Abstract
The ability to identify and temporally segment fine-grained actions in motion capture sequences is crucial for applications in human movement analysis. Motion capture is typically performed with optical or inertial measurement systems, which encode human movement as a time series of human joint locations and orientations or their higher-order representations. State-of-the-art action segmentation approaches use multiple stages of temporal convolutions. The main idea is to generate an initial prediction with several layers of temporal convolutions and refine these predictions over multiple stages, also with temporal convolutions. Although these approaches capture long-term temporal patterns, the initial predictions do not adequately consider the spatial hierarchy among the human joints. To address this limitation, we recently introduced multi-stage spatial-temporal graph convolutional neural networks (MS-GCN). Our framework replaces the initial stage of temporal convolutions with spatial graph convolutions and dilated temporal convolutions, which better exploit the spatial configuration of the joints and their long-term temporal dynamics. Our framework was compared to four strong baselines on five tasks. Experimental results demonstrate that our framework is a strong baseline for skeleton-based action segmentation.
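The abstract describes the core architectural change: the prediction stage combines spatial graph convolutions over the skeleton's joint graph with dilated temporal convolutions along the frame axis. Below is a minimal NumPy sketch of those two building blocks. This is not the authors' implementation; the array shapes, chain-graph adjacency, and weight sizes are illustrative assumptions, and the symmetric normalization with self-loops follows the common ST-GCN-style convention.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetrically normalize A + I (self-loops added), ST-GCN-style:
    # D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def spatial_graph_conv(X, A_norm, W):
    # X: (T, V, C_in) joint features per frame; A_norm: (V, V); W: (C_in, C_out).
    # Aggregate each joint's neighborhood per frame, then project channels.
    return np.einsum("uv,tvc,cd->tud", A_norm, X, W)

def dilated_temporal_conv(X, W, dilation):
    # X: (T, V, C_in); W: (K, C_in, C_out) kernel over K dilated time steps.
    # Zero-pads the time axis so the output keeps T frames ('same' padding).
    K = W.shape[0]
    pad = dilation * (K - 1) // 2
    Xp = np.pad(X, ((pad, pad), (0, 0), (0, 0)))
    T = X.shape[0]
    out = np.zeros((T, X.shape[1], W.shape[2]))
    for k in range(K):
        out += np.einsum("tvc,cd->tvd", Xp[k * dilation : k * dilation + T], W[k])
    return out

# Illustrative use: a 5-joint chain skeleton, 8 frames, 3 input channels.
rng = np.random.default_rng(0)
T, V, C = 8, 5, 3
A = np.zeros((V, V))
for i in range(V - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0  # bone between consecutive joints
X = rng.standard_normal((T, V, C))
H = spatial_graph_conv(X, normalize_adjacency(A), rng.standard_normal((C, 4)))
Y = dilated_temporal_conv(H, rng.standard_normal((3, 4, 4)), dilation=2)
```

In the full MS-GCN design, one such prediction stage (spatial graph conv followed by dilated temporal convs with growing dilation) produces initial per-frame action probabilities, which subsequent stages of plain temporal convolutions then refine.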
Pages: 202-212
Page count: 11