Adaptive Spatial-Temporal Graph-Mixer for Human Motion Prediction

Authors
Yang, Shubo [1, 2]
Li, Haolun [3]
Pun, Chi-Man [3]
Du, Chun [4]
Gao, Hao [1, 2]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210000, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Nanjing 210000, Peoples R China
[3] Univ Macau, Fac Sci & Technol, Dept Comp & Informat Sci, Macau 999078, Peoples R China
[4] Tibet Univ, Sch Sci, Tibet 850000, Peoples R China
Keywords
Adaptive learning; human motion prediction; graph convolution; network
DOI
10.1109/LSP.2024.3392686
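Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809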
Abstract
The Graph Convolutional Network (GCN) has recently achieved promising performance in human motion prediction by modeling the nodes and edges of the human skeleton. However, most previous methods still suffer from two unaddressed drawbacks. First, their graph topologies are static and fixed at inference, so the dependencies between nodes cannot be dynamically adjusted for different actions. Second, the implicit relationships between pose sequences are ignored, which forfeits the prior advantages of the graph structure in temporal feature fusion. To address these limitations, we propose an adaptive spatial-temporal graph-mixer (GraphMixer) for human motion prediction, which consists of a series of fully separated spatial-temporal graph convolution structures. In the spatial GCN, we construct an additional adaptive skeleton graph to capture the node features of action-specific poses. In the temporal GCN, we introduce a variety of graph topologies to enhance feature fusion between pose sequences. Comparisons with state-of-the-art algorithms on the Human3.6M and 3DPW datasets, together with ablation studies, show that our GraphMixer and the proposed multiple graph topologies are effective and critical.
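The separation of spatial and temporal graph mixing described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the record gives no architectural details, so the similarity-based `adaptive_adjacency`, the equal-weight average of the fixed and adaptive graphs, the chain-shaped graphs, and all function names are assumptions chosen only to show the idea of an input-dependent joint graph followed by a separate frame graph.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def row_normalize(m):
    """Scale each row so that it sums to 1."""
    return [[v / sum(row) for v in row] for row in m]


def row_softmax(m):
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def chain_adjacency(n):
    """Toy graph: self-loops plus links between neighbouring nodes."""
    return [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]


def adaptive_adjacency(x):
    """ASSUMPTION: joint graph derived from the input itself, as a softmax
    over joint-to-joint similarity. x has shape (frames, joints)."""
    scores = matmul(transpose(x), x)  # (joints, joints)
    scale = math.sqrt(len(x))
    return row_softmax([[s / scale for s in row] for row in scores])


def spatial_mix(x, skeleton):
    """Mix joints within each frame using fixed plus adaptive graphs."""
    fixed = row_normalize(skeleton)
    adapt = adaptive_adjacency(x)
    # ASSUMPTION: the two graphs are simply averaged.
    graph = [[0.5 * (f + a) for f, a in zip(fr, ar)] for fr, ar in zip(fixed, adapt)]
    return matmul(x, transpose(graph))


def temporal_mix(x, frame_graph):
    """Mix frames for each joint using a graph over time steps."""
    return matmul(row_normalize(frame_graph), x)


def mixer_block(x, skeleton, frame_graph):
    """Spatial mixing first, then temporal mixing, kept fully separate."""
    return temporal_mix(spatial_mix(x, skeleton), frame_graph)


# Toy input: 4 frames, 3 joints, one scalar feature per joint.
poses = [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4], [0.3, 0.4, 0.5], [0.4, 0.5, 0.6]]
mixed = mixer_block(poses, chain_adjacency(3), chain_adjacency(4))
```

Because `adaptive_adjacency` is recomputed from each input, the joint graph differs from one pose sequence to the next, which is the property the abstract contrasts with a topology that stays fixed at inference.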
Pages: 1244-1248
Page count: 5