Turkish sign language recognition based on multistream data fusion

Cited by: 8
Authors
Gunduz, Cemil [1 ]
Polat, Huseyin [2 ]
Affiliations
[1] Gazi Univ, Grad Sch Informat, Dept Informat Syst, Ankara, Turkey
[2] Gazi Univ, Fac Technol, Dept Comp Engn, Ankara, Turkey
Keywords
Deep learning; sign language recognition; 3D convolutional neural networks; long short-term memory; recurrent networks
DOI
10.3906/elk-2005-156
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Sign languages are nonverbal, visual languages used for communication by people with hearing or speech impairments. Besides the hands, other communication channels such as body posture and facial expressions also carry meaning in sign languages. Because gestures differ across the sign languages of different countries, the relative importance of these communication channels also varies from one sign language to another. In this study, a total of 8 data streams representing the communication channels used in Turkish sign language (4 RGB, 3 pose, and 1 optical flow) were analyzed. Inception 3D was used for the RGB and optical flow streams, and an LSTM-based recurrent network was used for the pose streams. Experiments were conducted by merging the data streams in different combinations, and a sign language recognition system that fuses the most suitable streams through a multistream late fusion mechanism was then proposed. Taken individually, the RGB streams reached accuracies between 28% and 79%, the pose streams between 9% and 50%, and the optical flow stream 78.5%. When the streams were used in combination, recognition performance was higher than with any single stream alone. The proposed system, built on this multistream data fusion mechanism, achieves an accuracy of 89.3% on the BosphorusSign General dataset. Multistream data fusion mechanisms therefore have great potential for improving sign language recognition results.
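The late fusion described in the abstract can be sketched as a weighted average of per-stream class-probability vectors (e.g., the softmax outputs of the I3D and LSTM branches). This is a minimal illustration, not the authors' implementation: the stream weights, class count, and probability values below are assumed for demonstration.

```python
import numpy as np

def late_fusion(stream_probs, weights=None):
    """Fuse per-stream class probabilities by weighted averaging.

    stream_probs: list of 1-D arrays, one softmax output per data stream
                  (e.g., RGB, pose, optical flow), all over the same classes.
    weights: optional per-stream weights; uniform if omitted.
    Returns the predicted class index and the fused probability vector.
    """
    probs = np.stack(stream_probs)                    # (n_streams, n_classes)
    if weights is None:
        weights = np.full(len(stream_probs), 1.0 / len(stream_probs))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                 # normalize so fused sums to 1
    fused = weights @ probs                           # weighted average over streams
    return int(np.argmax(fused)), fused

# Hypothetical softmax outputs from three streams over three sign classes
rgb  = np.array([0.7, 0.2, 0.1])
pose = np.array([0.1, 0.6, 0.3])
flow = np.array([0.5, 0.3, 0.2])
pred, fused = late_fusion([rgb, pose, flow])
```

In practice, the per-stream weights could be tuned on a validation set so that stronger streams (here, RGB and optical flow) contribute more than weaker ones (pose).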
Pages: 1171-1186
Page count: 16