Adversarial multi-task deep learning for signer-independent feature representation

Cited by: 3
Authors
Fang, Yuchun [1]
Xiao, Zhengye [1]
Cai, Sirui [1]
Ni, Lan [2]
Affiliations
[1] Shanghai University, School of Computer Engineering and Science, Shanghai 200444, People's Republic of China
[2] Shanghai University, College of Liberal Arts, Shanghai 200444, People's Republic of China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai;
Keywords
Sign language recognition; Multi-task learning; Deep learning;
DOI
10.1007/s10489-022-03649-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous research has achieved remarkable progress in Sign Language Recognition (SLR). However, for robust open-set SLR applications, it is necessary to solve signer-independent SLR. This paper proposes a novel adversarial multi-task deep learning (MTL) framework that can incorporate multiple modalities for isolated SLR. Employing the identity recognition task as the competition task to the target SLR task, the proposed model can effectively extract signer-independent features by deviating the optimization direction of the competitive task. Furthermore, the proposed adversarial MTL multi-modality framework can jointly incorporate positive and negative task learning with the target task. Combining multi-modality in the adversarial MTL, our model can extract robust signer-independent representation. We evaluate our method on multiple benchmark datasets from different sign languages. The experimental results demonstrate that the proposed adversarial MTL multi-modality model can effectively realize signer-independent SLR by compensation with relevant tasks and competition with irrelevant tasks.
Pages: 4380-4392
Number of pages: 13