Speaking-Rate Adaptation of Automatic Speech Recognition System through Fuzzy Classification based Time-Scale Modification

Cited by: 0
Authors
Shahnawazuddin, S. [1]
Kathania, Hemant K. [2]
Adiga, Nagaraj [3]
Sai, B. Tarun [1]
Ahmad, Waquar [4]
Affiliations
[1] NIT Patna, Dept ECE, Patna, Bihar, India
[2] NIT Sikkim, Dept ECE, South Sikkim, India
[3] Univ Crete, Dept CS, Iraklion, Greece
[4] NIT Calicut, Dept ECE, Kozhikode, India
Source
2019 25th National Conference on Communications (NCC), 2019
Keywords
Speaking-rate adaptation; automatic speech recognition; time-scale modification; fuzzy classification; SIGNALS;
DOI
10.1109/ncc.2019.8732255
Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology]
Subject classification code
0809
Abstract
In this paper, we study the role of speaking-rate adaptation (SRA) in automatic speech recognition (ASR) systems. The performance of an ASR system is reported to degrade when the speaking rate is either too fast or too slow. To simulate such a situation, an ASR system was trained on adults' speech and used to transcribe speech from both adult and child speakers. Earlier studies have shown that the speaking rate of children is significantly lower than that of adults. Consequently, recognition performance on children's speech was found to be very poor compared with that on adults' speech. To improve recognition performance on children's speech, the speaking rate was explicitly changed using time-scale modification (TSM). A recently proposed TSM approach based on fuzzy classification of spectral bins is explored for this purpose. The fuzzy-classification-based TSM technique is reported to be superior to state-of-the-art approaches, but its effectiveness has not yet been studied in the context of ASR. The experimental studies presented in this paper show that SRA based on fuzzy classification yields a relative improvement of 30% over the baseline.
Pages: 5