Global-Local Enhancement Network for NMF-Aware Sign Language Recognition

Cited by: 29

Authors:

Hu, Hezhen [1]

Zhou, Wengang [2]

Pu, Junfu [1]

Li, Houqiang [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei 230027, Peoples R China

[2] Univ Sci & Technol China, Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2021 / Volume 17 / Issue 03

Keywords:

Non-manual features; global-local enhancement network; NMFs-CSL dataset; sign language recognition; FRAMEWORK;

DOI:

10.1145/3436754

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Sign language recognition (SLR) is a challenging problem, involving complex manual features (i.e., hand gestures) and fine-grained non-manual features (NMFs) (e.g., facial expressions and mouth shapes). Although manual features are dominant, non-manual features also play an important role in the expression of a sign word. Specifically, many sign words convey different meanings due to non-manual features, even though they share the same hand gestures. This ambiguity introduces great challenges in the recognition of sign words. To tackle the above issue, we propose a simple yet effective architecture called Global-Local Enhancement Network (GLE-Net), including two mutually promoted streams toward different crucial aspects of SLR. Of the two streams, one captures the global contextual relationship, while the other stream captures the discriminative fine-grained cues. Moreover, due to the lack of datasets explicitly focusing on this kind of feature, we introduce the first non-manual-feature-aware isolated Chinese sign language dataset (NMFs-CSL) with a total vocabulary size of 1,067 sign words in daily life. Extensive experiments on NMFs-CSL and SLR500 datasets demonstrate the effectiveness of our method.


Pages: 19

