Adversarial multi-task deep learning for signer-independent feature representation

Cited by: 3

Authors:

Fang, Yuchun [1]

Xiao, Zhengye [1]

Cai, Sirui [1]

Ni, Lan [2]

Affiliations:

[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China

[2] Shanghai Univ, Coll Liberal Arts, Shanghai 200444, Peoples R China

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 04

Funding:

National Natural Science Foundation of China; Natural Science Foundation of Shanghai;

Keywords:

Sign language recognition; Multi-task learning; Deep learning;

DOI:

10.1007/s10489-022-03649-3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Previous research has achieved remarkable progress in Sign Language Recognition (SLR). However, for robust open-set SLR applications, it is necessary to solve signer-independent SLR. This paper proposes a novel adversarial multi-task deep learning (MTL) framework that can incorporate multiple modalities for isolated SLR. Employing the identity recognition task as the competition task to the target SLR task, the proposed model can effectively extract signer-independent features by deviating the optimization direction of the competitive task. Furthermore, the proposed adversarial MTL multi-modality framework can jointly incorporate positive and negative task learning with the target task. Combining multi-modality in the adversarial MTL, our model can extract robust signer-independent representation. We evaluate our method on multiple benchmark datasets from different sign languages. The experimental results demonstrate that the proposed adversarial MTL multi-modality model can effectively realize signer-independent SLR by compensation with relevant tasks and competition with irrelevant tasks.
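The abstract does not say how the optimization direction of the competitive (identity) task is deviated. One common way to realize this kind of adversarial multi-task objective is gradient reversal on the shared parameters, and the toy sketch below illustrates that idea only. The scalar model, the function names, and the weighting factor `lam` are all assumptions for illustration and are not taken from the paper.

```python
# Hypothetical sketch: the scalar model, all names, and the gradient-reversal
# rule are illustrative assumptions, not the paper's method.

def squared_error_grads(w, head, x, target):
    """Return (loss, d_loss/d_w, d_loss/d_head) for a scalar toy model.

    The shared feature is w * x and the task prediction is head * feature;
    the loss is 0.5 * (prediction - target) ** 2.
    """
    feature = w * x
    error = head * feature - target
    loss = 0.5 * error ** 2
    grad_w = error * head * x
    grad_head = error * feature
    return loss, grad_w, grad_head


def adversarial_step(w, sign_head, id_head, x, sign_target, id_target,
                     lam=0.25, lr=0.1):
    """One update of a two-task model with a competing identity task.

    Each head descends on its own loss. The shared weight descends on the
    sign loss but receives the identity gradient with its sign flipped,
    scaled by lam, so the shared feature becomes less useful for identity.
    """
    _, gw_sign, gh_sign = squared_error_grads(w, sign_head, x, sign_target)
    _, gw_id, gh_id = squared_error_grads(w, id_head, x, id_target)

    # Gradient reversal (assumed rule): subtract the identity-task gradient.
    shared_grad = gw_sign - lam * gw_id

    new_w = w - lr * shared_grad
    new_sign_head = sign_head - lr * gh_sign
    new_id_head = id_head - lr * gh_id
    return new_w, new_sign_head, new_id_head


if __name__ == "__main__":
    print(adversarial_step(1.0, 1.0, 1.0, x=1.0, sign_target=2.0, id_target=3.0))
```

In a real network the same sign flip would be applied to the identity-head gradient as it flows back into the shared feature extractor, while the positive (relevant) auxiliary tasks mentioned in the abstract would contribute their gradients without reversal.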


Pages: 4380-4392

Page count: 13
