KSOF: Leveraging kinematics and spatio-temporal optimal fusion for human motion prediction

Authors
Ding, Rui [1]
Qu, Kehua [1]
Tang, Jin [2]
Affiliations
[1] Capital Normal Univ, Informat Engn Coll, Beijing 100048, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Intelligent Engn & Automat, Beijing 100876, Peoples R China
Keywords
Human motion prediction; Kinematic constraints; Spatio-temporal optimal fusion
DOI
10.1016/j.patcog.2024.111206
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Human motion prediction methods that ignore the laws of kinematics tend to generate improbable or impractical predictions. Current methods attempt to tackle this problem by using simple kinematic information as auxiliary features. However, it remains challenging to exploit human prior knowledge more deeply, for example the prior that the trajectory formed by a single joint should be smooth and continuous. In this paper, we advocate describing kinematic information explicitly via velocity and acceleration, and propose a novel loss called joint point smoothness (JPS) loss, which calculates the acceleration of joints to smooth sudden changes in joint velocity. A second obstacle in this task is capturing spatio-temporal dependencies so that feature representations become more informative. We therefore propose a dual-path network (KSOF) that models temporal and spatial dependencies with a kinematic temporal convolutional network (K-TCN) and a spatial graph convolutional network (S-GCN), respectively. Moreover, we propose a novel multi-scale fusion module named spatio-temporal optimal fusion (SOF) to enhance the extraction of essential correlations and important features at different scales from the coupled spatio-temporal features. We evaluate our approach on three standard benchmark datasets: Human3.6M, CMU-Mocap, and 3DPW. For both short-term and long-term prediction, our method achieves outstanding performance on all three datasets. The code is available at https://github.com/qukehua/KSOF.
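The abstract describes the JPS loss only at a high level: it uses joint acceleration to penalize sudden changes in joint velocity. The sketch below illustrates that general idea with finite differences over a predicted pose sequence. It is not the paper's formulation: the function name `joint_smoothness_penalty`, the `(frames, joints, 3)` array layout, and the choice of a mean Euclidean norm are all assumptions made for illustration.

```python
import numpy as np


def joint_smoothness_penalty(pred: np.ndarray) -> float:
    """Illustrative acceleration-based smoothness penalty (hypothetical helper).

    pred: array of shape (T, J, 3) holding predicted 3D positions
          of J joints over T consecutive frames (assumed layout).
    """
    if pred.ndim != 3 or pred.shape[0] < 3:
        raise ValueError("expected shape (T, J, 3) with at least 3 frames")

    # Velocity: first-order difference between consecutive frames, (T-1, J, 3).
    velocity = np.diff(pred, n=1, axis=0)
    # Acceleration: difference between consecutive velocities, (T-2, J, 3).
    acceleration = np.diff(velocity, n=1, axis=0)

    # Mean magnitude of per-joint acceleration; it is zero when every joint
    # moves at constant velocity and grows with abrupt velocity changes.
    return float(np.mean(np.linalg.norm(acceleration, axis=-1)))


# A trajectory with constant velocity versus one with a sudden jump.
frames = np.arange(5, dtype=float).reshape(5, 1, 1)
smooth = frames * np.ones((1, 2, 3))  # 5 frames, 2 joints, linear motion
jerky = smooth.copy()
jerky[2] += 1.0                       # displace every joint in frame 2

smooth_penalty = joint_smoothness_penalty(smooth)
jerky_penalty = joint_smoothness_penalty(jerky)
```

In a training setup, a term of this kind would typically be added to a position-based prediction loss with a weighting coefficient; how the paper weights or combines it is not stated in the abstract.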
Pages: 11