Adversarial multi-task deep learning for signer-independent feature representation

Cited by: 3

Authors:

Fang, Yuchun [1]

Xiao, Zhengye [1]

Cai, Sirui [1]

Ni, Lan [2]

Affiliations:

[1] Shanghai University, School of Computer Engineering and Science, Shanghai 200444, People's Republic of China

[2] Shanghai University, College of Liberal Arts, Shanghai 200444, People's Republic of China

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 04

Funding:

National Natural Science Foundation of China; Natural Science Foundation of Shanghai;

Keywords:

Sign language recognition; Multi-task learning; Deep learning;

DOI:

10.1007/s10489-022-03649-3

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Previous research has achieved remarkable progress in Sign Language Recognition (SLR). However, for robust open-set SLR applications, it is necessary to solve signer-independent SLR. This paper proposes a novel adversarial multi-task deep learning (MTL) framework that can incorporate multiple modalities for isolated SLR. Employing the identity recognition task as the competition task to the target SLR task, the proposed model can effectively extract signer-independent features by deviating the optimization direction of the competitive task. Furthermore, the proposed adversarial MTL multi-modality framework can jointly incorporate positive and negative task learning with the target task. Combining multi-modality in the adversarial MTL, our model can extract robust signer-independent representation. We evaluate our method on multiple benchmark datasets from different sign languages. The experimental results demonstrate that the proposed adversarial MTL multi-modality model can effectively realize signer-independent SLR by compensation with relevant tasks and competition with irrelevant tasks.
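The abstract does not spell out the optimization rule, so the following is only a minimal sketch of one common way to "deviate the optimization direction" of a competitive task: subtracting its gradient from the update applied to the shared encoder (the idea behind gradient reversal). It is not the authors' implementation. The names `shared_gradient`, `sgd_step`, `pos_weight`, and `adv_weight` are hypothetical and chosen for illustration; gradients are plain Python lists standing in for the shared parameters' gradients.

```python
from typing import List, Sequence


def shared_gradient(
    target_grad: Sequence[float],
    positive_grads: Sequence[Sequence[float]],
    negative_grads: Sequence[Sequence[float]],
    pos_weight: float = 1.0,   # assumed weight for relevant (positive) tasks
    adv_weight: float = 0.5,   # assumed weight for competitive (negative) tasks
) -> List[float]:
    """Combine per-task gradients with respect to the shared encoder parameters.

    Relevant tasks are added to the target (SLR) gradient, while competitive
    tasks such as signer identity recognition are subtracted, which pushes the
    shared features away from being useful for that task.
    """
    combined = list(target_grad)
    for grad in positive_grads:
        for i, value in enumerate(grad):
            combined[i] += pos_weight * value
    for grad in negative_grads:
        for i, value in enumerate(grad):
            combined[i] -= adv_weight * value  # reversed direction
    return combined


def sgd_step(params: Sequence[float], grad: Sequence[float], lr: float = 0.1) -> List[float]:
    """Apply one plain gradient-descent step to the shared parameters."""
    return [p - lr * g for p, g in zip(params, grad)]


# Toy values: one target gradient, one relevant task, one competitive task.
grad = shared_gradient([1.0, 2.0], [[0.5, 0.5]], [[2.0, -2.0]])
params = sgd_step([0.0, 0.0], grad, lr=1.0)
```

In this sketch the task-specific heads would still be trained with their own unreversed gradients; only the shared encoder sees the combined update. How the paper weights or schedules the competitive term is not stated in this record.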

Pages: 4380-4392

Page count: 13

