SKIM: Skeleton-Based Isolated Sign Language Recognition With Part Mixing

Cited by: 4
Authors
Lin, Kezhou [1]
Wang, Xiaohan [2]
Zhu, Linchao [1]
Zhang, Bang [3]
Yang, Yi [1]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Alibaba Grp, DAMO Acad, Hangzhou 311121, Peoples R China
Keywords
Sign language; Face recognition; Biological system modeling; Manuals; Benchmark testing; Assistive technologies; Data augmentation; sign language recognition; skeleton; MODEL
DOI
10.1109/TMM.2023.3321502
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we present SKIM, skeleton-based isolated sign language recognition (IsoSLR) with part mixing: an IsoSLR model that takes only the skeleton representation of the human body as input. Previous skeleton-based works either perform worse than their RGB-based counterparts or require fusion with other modalities to obtain competitive results. With SKIM, a single skeleton-based model without complex pre-training can obtain accuracy similar to or even higher than that of current state-of-the-art methods. This margin can be further increased by simple late fusion within the same modality. To achieve this, we first develop a novel data augmentation technique called part mixing. It swaps the corresponding keypoints within one region (e.g., a hand) between two randomly selected samples and linearly combines their labels to form the new label. As regions such as the hands and face are key articulators in sign language, directly swapping such parts creates a believable pseudo sign that encourages the model to recognize the true pairs. Second, following recent advances in skeleton-based action recognition, we devise a channel-wise graph neural network with multi-scale awareness and per-keypoint temporal re-weighting. With this design, the backbone is capable of leveraging both manual and non-manual features. The combination of hand mixing and the channel-wise multi-scale GCN backbone allows us to achieve state-of-the-art accuracy on both the WLASL and NMFs-CSL benchmarks.
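The part-mixing step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the keypoint layout in `PARTS`, the function names, the `(frames, keypoints, channels)` array shape, and the mixing weight `lam` are all hypothetical, and the abstract does not say how the label weights are chosen. The sketch also assumes both samples have already been resampled to the same number of frames.

```python
import numpy as np

# Hypothetical keypoint layout (an assumption for illustration; the abstract
# does not specify how keypoints are indexed or grouped).
PARTS = {
    "body": np.arange(0, 11),
    "left_hand": np.arange(11, 32),
    "right_hand": np.arange(32, 53),
    "face": np.arange(53, 71),
}


def part_mix(x_a, y_a, x_b, y_b, part, lam):
    """Swap one region of sample A with the same region of sample B.

    x_a, x_b: skeleton sequences of shape (frames, keypoints, channels).
    y_a, y_b: one-hot label vectors of shape (num_classes,).
    part:     key into PARTS naming the region to swap.
    lam:      weight kept on A's label (assumed; not given in the abstract).
    """
    idx = PARTS[part]
    mixed = x_a.copy()                     # leave the input sample untouched
    mixed[:, idx, :] = x_b[:, idx, :]      # copy B's keypoints for that region
    label = lam * y_a + (1.0 - lam) * y_b  # linear combination of the labels
    return mixed, label


def random_part_mix(x_a, y_a, x_b, y_b, rng, lam=0.5):
    """Pick a region at random, then apply part_mix (sampling is an assumption)."""
    part = str(rng.choice(sorted(PARTS)))
    return part_mix(x_a, y_a, x_b, y_b, part, lam)


# Usage: mix the right hand of sample B into sample A.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(16, 71, 2))
x_b = rng.normal(size=(16, 71, 2))
y_a, y_b = np.eye(5)[1], np.eye(5)[3]
mixed, label = part_mix(x_a, y_a, x_b, y_b, "right_hand", lam=0.7)
```

Copying raw coordinates is the simplest possible variant; a practical version would likely need to re-align the swapped part (for example, relative to the wrist) so that it attaches plausibly to the target body, which this sketch does not attempt.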
Pages: 4271-4280
Number of pages: 10
Related papers
50 in total
  • [31] STAR: An STGCN ARchitecture for Skeleton-Based Human Action Recognition
    Wu, Weiwei
    Tu, Fengbin
    Niu, Mengqi
    Yue, Zhiheng
    Liu, Leibo
    Wei, Shaojun
    Li, Xiangyu
    Hu, Yang
    Yin, Shouyi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (06) : 2370 - 2383
  • [32] Skeleton-Based Gesture Recognition With Learnable Paths and Signature Features
    Cheng, Jiale
    Shi, Dongzi
    Li, Chenyang
    Li, Yu
    Ni, Hao
    Jin, Lianwen
    Zhang, Xin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 3951 - 3961
  • [33] Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition
    Zhu, Yiran
    Huang, Guangji
    Xu, Xing
    Ji, Yanli
    Shen, Fumin
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022 : 518 - 526
  • [34] MOTION-LET CLUSTERING FOR SKELETON-BASED ACTION RECOGNITION
    Yang, Jianyu
    Zhu, Chen
    Yuan, Junsong
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019 : 150 - 155
  • [35] A Cross View Learning Approach for Skeleton-Based Action Recognition
    Zheng, Hui
    Zhang, Xinming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (05) : 3061 - 3072
  • [36] Isolated Sign Language Recognition with Depth Cameras
    Oszust, Mariusz
    Krupski, Jakub
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KSE 2021), 2021, 192 : 2085 - 2094
  • [37] Improved Key Poses Model for Skeleton-Based Action Recognition
    Li, Xiaoqiang
    Zhang, Yi
    Zhang, Junhui
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT II, 2018, 10736 : 358 - 367
  • [38] Skeleton-Based Action Recognition With Gated Convolutional Neural Networks
    Cao, Congqi
    Lan, Cuiling
    Zhang, Yifan
    Zeng, Wenjun
    Lu, Hanqing
    Zhang, Yanning
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (11) : 3247 - 3257
  • [39] GRAPH CONVOLUTIONAL LSTM MODEL FOR SKELETON-BASED ACTION RECOGNITION
    Zhang, Han
    Song, Yonghong
    Zhang, Yuanlin
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019 : 412 - 417
  • [40] Feedback Graph Convolutional Network for Skeleton-Based Action Recognition
    Yang, Hao
    Yan, Dan
    Zhang, Li
    Sun, Yunda
    Li, Dong
    Maybank, Stephen J.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 164 - 175