Adversarial multi-task deep learning for signer-independent feature representation

Citations: 3
Authors
Fang, Yuchun [1]
Xiao, Zhengye [1]
Cai, Sirui [1]
Ni, Lan [2]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Coll Liberal Arts, Shanghai 200444, Peoples R China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai;
Keywords
Sign language recognition; Multi-task learning; Deep learning;
DOI
10.1007/s10489-022-03649-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Previous research has achieved remarkable progress in Sign Language Recognition (SLR). However, for robust open-set SLR applications, it is necessary to solve signer-independent SLR. This paper proposes a novel adversarial multi-task deep learning (MTL) framework that can incorporate multiple modalities for isolated SLR. Employing the identity recognition task as the competition task to the target SLR task, the proposed model can effectively extract signer-independent features by deviating the optimization direction of the competitive task. Furthermore, the proposed adversarial MTL multi-modality framework can jointly incorporate positive and negative task learning with the target task. Combining multi-modality in the adversarial MTL, our model can extract robust signer-independent representation. We evaluate our method on multiple benchmark datasets from different sign languages. The experimental results demonstrate that the proposed adversarial MTL multi-modality model can effectively realize signer-independent SLR by compensation with relevant tasks and competition with irrelevant tasks.
Pages: 4380-4392
Page count: 13