Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

Authors
Zhou, Kanglei [1]
Shum, Hubert P. H. [2]
Li, Frederick W. B. [2]
Liang, Xiaohui [1, 3]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[3] Zhongguancun Lab, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China; Engineering and Physical Sciences Research Council (UK)
Keywords
Graph convolutional network; hand motion denoising; hand motion prediction; multi-task learning; generative adversarial network
DOI
10.1109/TVCG.2023.3337868
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
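The abstract mentions a hand bone loss but does not define it. As a rough illustration only, one common way to encourage anatomically plausible output is to penalize differences in bone length between the generated and reference skeletons. The sketch below assumes a 21-joint hand layout and a hypothetical `PARENTS` array; the function names and the L1 formulation are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical 21-joint hand skeleton: joint 0 is the wrist and each finger
# is a chain of four joints. PARENTS[j] is the parent of joint j (-1 = root).
# This layout is an assumption for illustration, not the paper's definition.
PARENTS = np.array([-1,
                    0, 1, 2, 3,       # thumb
                    0, 5, 6, 7,       # index
                    0, 9, 10, 11,     # middle
                    0, 13, 14, 15,    # ring
                    0, 17, 18, 19])   # little


def bone_lengths(joints):
    """Return bone lengths for joint positions of shape (..., 21, 3)."""
    child = np.arange(1, len(PARENTS))   # every joint except the root
    parent = PARENTS[1:]                 # the parent of each of those joints
    return np.linalg.norm(joints[..., child, :] - joints[..., parent, :], axis=-1)


def bone_length_loss(pred, target):
    """Mean absolute bone-length difference between two motion clips."""
    return float(np.mean(np.abs(bone_lengths(pred) - bone_lengths(target))))


# Toy data: 8 frames of random joint positions plus a noisy copy.
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 21, 3))
noisy = clean + rng.normal(scale=0.05, size=clean.shape)

loss_same = bone_length_loss(clean, clean)
loss_noisy = bone_length_loss(noisy, clean)
```

Because the loss depends only on distances between connected joints, it ignores global translation of the hand and reacts only to changes in the skeleton's proportions. In a multi-task setting, a term like this would typically be added to the position losses of both the denoising and prediction outputs.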
Pages: 6754-6769
Number of pages: 16