Pakistan sign language recognition: leveraging deep learning models with limited dataset

Cited by: 0
Authors
Hafiz Muhammad Hamza
Aamir Wali
Affiliations
[1] National University of Computer and Emerging Sciences, FAST School of Computing
Source
Machine Vision and Applications | 2023 / Vol. 34
Keywords
Sign language recognition; Pakistan Sign Language; PSL data dictionary; C3D; Data augmentation
DOI
Not available
Abstract
Sign language is the primary means of communication for a large segment of society. Sign languages are visual in nature, which makes them very different from spoken languages. Unfortunately, very few hearing people understand sign language, which makes communication with the hearing-impaired extremely difficult. Research on sign language recognition can help reduce the barrier between deaf and hearing people. A great deal of work has been done on sign language recognition for languages such as American Sign Language and Chinese Sign Language, but little to no work has been done for Pakistan Sign Language. Existing contributions to Pakistan Sign Language recognition are limited to static images rather than gestures. Furthermore, the dataset available for this language contains very few examples per word, which makes it difficult to train deep networks that require a considerable amount of training data. Data augmentation techniques help a network generalize better by providing more variety in the training data. In this paper, a pipeline for a Pakistan Sign Language recognition system is proposed that incorporates an augmentation unit. To validate the effectiveness of the proposed pipeline, three deep learning models are used: C3D, I3D, and TSM. Results show that translation and rotation are the two best augmentation techniques for the Pakistan Sign Language dataset. The models trained with the augmentation-supported pipeline outperform other methods that used only the original data. The most suitable model is C3D, which not only achieved an accuracy of 93.33% but also has a lower training time than the other models.
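The abstract names translation and rotation as the most effective augmentations but does not describe how they are applied. The following is a minimal sketch of one way such an augmentation unit might look, not the authors' implementation. It assumes a clip is stored as a NumPy array of shape (T, H, W) or (T, H, W, C), applies the same shift and rotation to every frame so that the motion of the gesture stays consistent, and uses nearest-neighbour sampling with zero fill. The function names `affine_augment_clip` and `random_augment`, and the default ranges for shift and angle, are hypothetical choices made for illustration.

```python
import numpy as np


def affine_augment_clip(clip, shift=(0, 0), angle_deg=0.0):
    """Apply one translation and one rotation to every frame of a clip.

    clip: array of shape (T, H, W) or (T, H, W, C).
    shift: (dy, dx) in pixels; positive values move content down/right.
    angle_deg: rotation about the frame centre, in degrees.
    Output pixels whose source falls outside the frame are set to zero.
    """
    clip = np.asarray(clip)
    h, w = clip.shape[1:3]
    dy, dx = shift
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0

    # Inverse mapping: for each output pixel, locate its source pixel.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    y0 = ys - cy - dy
    x0 = xs - cx - dx
    src_y = np.rint(cos_t * y0 - sin_t * x0 + cy).astype(int)
    src_x = np.rint(sin_t * y0 + cos_t * x0 + cx).astype(int)

    # Copy only pixels whose source lies inside the frame.
    valid = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(clip)
    out[:, valid] = clip[:, src_y[valid], src_x[valid]]
    return out


def random_augment(clip, rng, max_shift=4, max_angle=10.0):
    """Draw one random shift and angle (illustrative ranges) for a clip."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    angle = rng.uniform(-max_angle, max_angle)
    return affine_augment_clip(clip, shift=(int(dy), int(dx)), angle_deg=angle)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((16, 32, 32, 3))  # stand-in for one sign video
    extra = [random_augment(clip, rng) for _ in range(5)]
    print(len(extra), extra[0].shape)
```

Sharing one transform across all frames of a clip is a deliberate choice here: perturbing each frame independently would add jitter that is absent from real recordings. A production pipeline would more likely rely on an image library with interpolated warping.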