SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 4
Authors
Lin, Kezhou [1]
Wang, Xiaohan [2]
Zhu, Linchao [1]
Zhang, Bang [3]
Yang, Yi [1]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China
Keywords
Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL;
DOI
10.1109/TMM.2023.3321502
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we present SKIM, a skeleton-based isolated sign language recognition (IsoSLR) model with part mixing that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain similar or even higher accuracy than current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., a hand) between two randomly selected samples and combines their labels linearly to form the new label. As regions like the hands and face are key articulators for sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Second, following current advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone is capable of leveraging both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
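The part-mixing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keypoint index ranges in `PARTS`, the function name `part_mix`, the `(T, V, C)` array layout, and the label weight `lam` are all assumptions made for illustration. The abstract only states that keypoints of one region are swapped between two samples and that the labels are combined linearly.

```python
import numpy as np

# Hypothetical keypoint layout; the real index ranges depend on the
# pose estimator and are NOT given in the abstract.
PARTS = {
    "left_hand": np.arange(0, 21),
    "right_hand": np.arange(21, 42),
    "face": np.arange(42, 60),
}

def part_mix(x_a, y_a, x_b, y_b, part, lam=0.5):
    """Replace one region of sample a with the same region of sample b.

    x_a, x_b: arrays of shape (T, V, C) = (frames, keypoints, coords);
              both samples are assumed to share the same shape.
    y_a, y_b: one-hot label vectors of equal length.
    lam:      weight kept for y_a (assumed value; the abstract only says
              the labels are combined linearly).
    """
    idx = PARTS[part]
    mixed = x_a.copy()                   # leave the input untouched
    mixed[:, idx, :] = x_b[:, idx, :]    # swap in the region from sample b
    label = lam * y_a + (1.0 - lam) * y_b
    return mixed, label

# Usage with random stand-in data (8 frames, 60 keypoints, 2-D coords).
rng = np.random.default_rng(0)
xa, xb = rng.normal(size=(8, 60, 2)), rng.normal(size=(8, 60, 2))
ya, yb = np.eye(5)[1], np.eye(5)[3]
mixed, label = part_mix(xa, ya, xb, yb, "right_hand", lam=0.7)
```

In this sketch the mixed sample keeps sample a's body and face while taking sample b's right hand, and the soft label assigns weight to both source classes.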
Pages: 4271-4280
Page count: 10