Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

Cited by: 0
Authors
Zhou, Kanglei [1]
Shum, Hubert P. H. [2]
Li, Frederick W. B. [2]
Liang, Xiaohui [1, 3]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[3] Zhongguancun Lab, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Graph convolutional network; hand motion denoising; hand motion prediction; multi-task learning; generative adversarial network
DOI
10.1109/TVCG.2023.3337868
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
Pages: 6754-6769
Page count: 16
Related papers
34 in total
  • [21] GSTGM: Graph, spatial-temporal attention and generative based model for pedestrian multi-path prediction
    Khel, Muhammad Haris Kaka
    Greaney, Paul
    McAfee, Marion
    Moffett, Sandra
    Meehan, Kevin
    IMAGE AND VISION COMPUTING, 2024, 151
  • [22] Spatial and temporal saliency based four-stream network with multi-task learning for action recognition
    Zong, Ming
    Wang, Ruili
    Ma, Yujun
    Ji, Wanting
    APPLIED SOFT COMPUTING, 2023, 132
  • [23] STA-GCN: two-stream graph convolutional network with spatial-temporal attention for hand gesture recognition
    Zhang, Wei
    Lin, Zeyi
    Cheng, Jian
    Ma, Cuixia
    Deng, Xiaoming
    Wang, Hongan
    VISUAL COMPUTER, 2020, 36 (10-12) : 2433 - 2444
  • [24] A motion-aware and temporal-enhanced Spatial-Temporal Graph Convolutional Network for skeleton-based human action segmentation
    Chai, Shurong
    Jain, Rahul Kumar
    Liu, Jiaqing
    Teng, Shiyu
    Tateyama, Tomoko
    Li, Yinhao
    Chen, Yen-Wei
    NEUROCOMPUTING, 2024, 580
  • [25] Multi-branch spatial-temporal-spectral convolutional neural networks for multi-task motor imagery EEG classification
    Cai, Zikun
    Luo, Tian-jian
    Cao, Xuan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 93
  • [26] Attention Based Multi-scale Spatial-temporal Fusion Propagation Graph Network for Traffic Flow Prediction
    Tian, Yuxin
    Zhang, Qiliang
    Li, Xiaomeng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT II, ICIC 2024, 2024, 14876 : 125 - 136
  • [27] Multi-Step Passenger Flow Prediction for Urban Metro System Based on Spatial-Temporal Graph Neural Network
    Chang, Yuchen
    Zong, Mengya
    Dang, Yutian
    Wang, Kaiping
    APPLIED SCIENCES-BASEL, 2024, 14 (18)
  • [28] Short-term power load forecasting based on spatial-temporal dynamic graph and multi-scale Transformer
    Zhu, Li
    Gao, Jingkai
    Zhu, Chunqiang
    Deng, Fan
    JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING, 2025, 12 (02) : 92 - 111
  • [29] MSSTGCN: Multi-Head Self-Attention and Spatial-Temporal Graph Convolutional Network for Multi-Scale Traffic Flow Prediction
    Zong, Xinlu
    Yu, Fan
    Chen, Zhen
    Xia, Xue
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (02) : 3517 - 3537
  • [30] Multi-task learning for gait-based identity recognition and emotion recognition using attention enhanced temporal graph convolutional network
    Sheng, Weijie
    Li, Xinde
    PATTERN RECOGNITION, 2021, 114