Semantic Correlation Attention-Based Multiorder Multiscale Feature Fusion Network for Human Motion Prediction

Cited by: 3

Authors:

Li, Qin [1]

Wang, Yong [1]

Lv, Fanbing [2]

Affiliations:

[1] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China

[2] Changsha Hisense Intelligent Syst Res Inst Co Ltd, Res & Dev Dept, Changsha 410208, Peoples R China

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2024 / Vol. 54 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Semantics; Feature extraction; Correlation; Joints; Predictive models; Bones; Decoding; Feature fusion; graph neural network (GNN); human motion prediction; semantic correlation attention;

DOI:

10.1109/TCYB.2022.3184977

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Human motion prediction aims to predict future human states from observed human states. However, current research ignores the semantic correlations between body parts (joints and bones) in the observed human states and motion time, which limits prediction accuracy. To address this issue, we propose a novel semantic correlation attention-based multiorder multiscale feature fusion network (SCAFF), which consists of an encoder and a decoder. In the encoder, a multiorder difference calculation module (MODC) calculates the multiorder difference information of joint and bone attributes in the observed human states. Then, multiple semantic correlation attention-based graph calculation operators (SCA-GCOs) are stacked to extract the multiscale features of the multiorder difference information. Each SCA-GCO captures joint and bone dependencies of the multiorder difference information, refines them with a semantic correlation attention module (SCAM), and captures the temporal dynamics of the refined joint and bone dependencies as its output features. Note that SCAM learns a semantic attention mask that describes the semantic correlations between body parts and motion time and is used for feature refinement. Afterward, multiple multiorder feature fusion modules (MOFFs) and multiscale feature fusion modules (MSFFs) fuse the multiscale features of the multiorder difference information extracted by the SCA-GCOs, yielding the motion features of the observed human states. Based on these motion features, the decoder recurrently applies a composite gated recurrent module (CGRM) and multilayer perceptrons (MLPs) to predict future human states. To the best of our knowledge, this is the first attempt to consider the semantic correlations between body parts and motion time in human motion prediction. The results on public datasets demonstrate that SCAFF outperforms existing models.
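
The abstract names a multiorder difference calculation over joint and bone attributes but does not define it. The sketch below shows one plausible reading, assuming that a bone attribute is a joint position minus its parent joint's position and that an order-k difference is the finite difference along time applied k times. The function names, the `parents` skeleton, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]
Pose = List[Vector]    # one frame: coordinates of every joint
Motion = List[Pose]    # a sequence of frames


def bone_vectors(pose: Pose, parents: Sequence[int]) -> Pose:
    """Assumed bone attribute: joint position minus its parent's position.

    A joint whose parent index is -1 is treated as the root and gets a zero bone.
    """
    bones = []
    for j, p in enumerate(parents):
        if p < 0:
            bones.append([0.0] * len(pose[j]))
        else:
            bones.append([a - b for a, b in zip(pose[j], pose[p])])
    return bones


def temporal_difference(motion: Motion) -> Motion:
    """First-order difference along time: frame[t + 1] - frame[t]."""
    return [
        [[a - b for a, b in zip(nxt_j, cur_j)] for nxt_j, cur_j in zip(nxt, cur)]
        for cur, nxt in zip(motion, motion[1:])
    ]


def multiorder_differences(motion: Motion, max_order: int) -> List[Motion]:
    """Return the differences of order 0, 1, ..., max_order as a list."""
    orders = [motion]
    for _ in range(max_order):
        orders.append(temporal_difference(orders[-1]))
    return orders


# Toy data: 3 joints in 2D over 4 frames, hypothetical chain skeleton 0 -> 1 -> 2.
parents = [-1, 0, 1]
joints = [
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 3.0], [1.0, 3.0], [2.0, 5.0]],
    [[0.0, 6.0], [1.0, 6.0], [2.0, 9.0]],
]
bones = [bone_vectors(frame, parents) for frame in joints]
joint_orders = multiorder_differences(joints, max_order=2)
bone_orders = multiorder_differences(bones, max_order=2)
```

Under these assumptions, order 1 behaves like a per-frame velocity and order 2 like an acceleration, and each extra order shortens the sequence by one frame. How the paper aligns or pads the different orders before feature extraction is not stated in the abstract.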

Citation

Pages: 825 - 838

Page count: 14

39 references in total

