Real-Time Speech-to-Text Holographic Communication for the Deaf Children and Elderly

Cited by: 0

Authors:

Charan, Sreevathsa Sree [1]

Srihitha, Vemula [1]

Palaniswamy, Suja [1]

Chethana, Savarala [1]

Affiliations:

[1] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Amrita Sch Comp, Bengaluru, India

Source:

2024 FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA PROCESSING, COMMUNICATION & INFORMATION TECHNOLOGY, MPCIT | 2024

Keywords:

Automatic Speech Recognition; Hologram; Speech-to-Text; Wav2Vec2;

DOI:

10.1109/MPCIT62449.2024.10892690

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In the field of assistive technologies, innovative solutions that cater to the needs of individuals with disabilities are paramount for fostering inclusive communication environments. The proposed work introduces a real-time speech-to-text conversion system that uses holographic display technology to aid communication for the deaf and hard-of-hearing (DHH). The system incorporates two complementary speech recognition approaches, each of which contributes in a different way to the system's overall performance and accuracy and helps reduce latency. The first is a conventional speech recognition module; the second is an enhanced version of the Wav2Vec2 model, which has been retrained to improve its efficacy with diverse speech patterns. Hosted on a Raspberry Pi and interfaced with a USB microphone, the system captures spoken language effectively. Upon recognition, the text is instantly displayed on a holographic screen made up of LED matrices set against an acrylic glass surface, creating a visually engaging representation of floating text. The integration of these two recognition methods enhances the system's accuracy and reliability in processing real-time speech. By providing a visual output of spoken words, the system not only bridges communication gaps but also enriches interaction in various social and professional settings. The application of holographic technology in this context is pioneering, offering substantial promise for developing accessible communication tools that are both functional and inclusive. This work underscores the potential of advanced display and speech processing technologies to transform interactions for individuals with hearing impairments, fostering greater inclusivity and engagement in everyday communication.

Pages: 327-331

Page count: 5
