Multi-View Bangla Sign Language Recognition: A New Word-Level Video Dataset

Cited by: 0
Authors
Islam, Md Shamimul [1]
Joha, A. J. M. Akhtarujjaman [2]
Hossain, Md Nur [3]
Abdullah, Sohaib [4]
Elwarfalli, Ibrahim [5]
Hasan, Md Mahedi [6]
Khan, Muhammad Imran [6]
Affiliations
[1] Asian Univ Bangladesh AUB, Dept CSE, Dhaka, Bangladesh
[2] Bangladesh Univ Professionals, Dept ICT, Dhaka, Bangladesh
[3] Manarat Int Univ, Dept CSE, Dhaka, Bangladesh
[4] AUB, Dept CSE, Dhaka, Bangladesh
[5] West Virginia Univ, Morgantown, WV USA
[6] BUET, IICT, Dhaka, Bangladesh
Source
2024 INTERNATIONAL CONFERENCE ON ACTIVITY AND BEHAVIOR COMPUTING, ABC 2024 | 2024
Keywords
Bangla Sign Language Recognition; Dataset; Attention; Bi-GRU;
DOI
10.1109/ABC61795.2024.10652018
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sign language recognition is essential for overcoming communication barriers, especially for those with verbal challenges. However, recognizing sign language presents various challenges, including shared gestures, lighting, bodily poses, and environmental differences. The scarcity of comprehensive Bangla sign language video datasets exacerbates these challenges, especially for deep learning-based algorithms. To address this gap, we develop MVBSL-W50, a multi-view Bangla sign language dataset encompassing 50 lexically isolated signs across 13 categories. We also design a model based on human pose information, achieving 89.69% accuracy. We conduct experiments to evaluate the model's performance under angular variations and different lighting conditions, emphasizing its robustness and applicability in real-world settings. Our model is further evaluated on the Indian Lexicon Sign Language Dataset (INCLUDE), where it achieves an accuracy of 96.60%. This significant improvement underscores the effectiveness of our approach in Bangla Sign Language (BSL).
Pages: 6
Related papers
26 in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] Agris Ulrich von, 2010, SIGNUM Database: Video corpus for signer-independent continuous sign language recognition
  • [3] Bintey Hoque Oishee, 2021, Computer Vision - ACCV 2020 Workshops. 15th Asian Conference on Computer Vision. Revised Selected Papers. Lecture Notes in Computer Science (LNCS 12628), P71, DOI 10.1007/978-3-030-69756-3_6
  • [4] Cho KYHY, 2014, Arxiv, DOI [arXiv:1406.1078, DOI 10.48550/ARXIV.1406.1078]
  • [5] Deb Kaushik, 2012, Global journal of computer science and technology, V12
  • [6] Dewanjee Tanmoy, Recognition of Bangladeshi sign language from 2D videos using OpenPose and LSTM based RNN
  • [7] Duarte Amanda, Palaskar Shruti, Ventura Lucas, Ghadiyaram Deepti, DeHaan Kenneth, Metze Florian, Torres Jordi, Giro-i-Nieto Xavier, 2021, How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language, 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, P2734-2743
  • [8] Glorot X., 2010, P 13 INT C ART INT S, P249
  • [9] GOOD IJ, 1952, J ROY STAT SOC B, V14, P107
  • [10] Google, MediaPipe: Cross-platform, customizable ML solutions for live and streaming media