Speaking-Rate Adaptation of Automatic Speech Recognition System through Fuzzy Classification based Time-Scale Modification

Cited by: 0
Authors
Shahnawazuddin, S. [1 ]
Kathania, Hemant K. [2 ]
Adiga, Nagaraj [3 ]
Sai, B. Tarun [1 ]
Ahmad, Waquar [4 ]
Affiliations
[1] NIT Patna, Dept ECE, Patna, Bihar, India
[2] NIT Sikkim, Dept ECE, South Sikkim, India
[3] Univ Crete, Dept CS, Iraklion, Greece
[4] NIT Calicut, Dept ECE, Kozhikode, India
Source
2019 25TH NATIONAL CONFERENCE ON COMMUNICATIONS (NCC) | 2019
Keywords
Speaking-rate adaptation; automatic speech recognition; time-scale modification; fuzzy classification; SIGNALS;
DOI
10.1109/ncc.2019.8732255
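Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract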
In this paper, we study the role of speaking-rate adaptation (SRA) in automatic speech recognition (ASR) systems. The performance of an ASR system is reported to degrade when the speaking-rate is either too fast or too slow. To simulate such a situation, an ASR system was trained on adults' speech and used to transcribe speech data from adult as well as child speakers. Earlier studies have shown that the speaking-rate is significantly lower for children than for adults. Consequently, the recognition performance for children's speech was noted to be very poor in contrast to that for adults' speech. To improve the recognition performance for children's speech, the speaking-rate was explicitly changed using time-scale modification (TSM). A recently proposed TSM approach based on fuzzy classification of spectral bins has been explored in this regard. The fuzzy-classification-based TSM technique is reported to be superior to state-of-the-art approaches. The effectiveness of this TSM technique has not yet been studied in the context of ASR. The experimental studies presented in this paper show that SRA based on fuzzy classification results in a relative improvement of 30% over the baseline.
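The abstract does not spell out the fuzzy-classification TSM algorithm, so the following is only a rough illustration of what time-scale modification does to speaking-rate. It uses plain overlap-add (OLA), a much simpler technique swapped in for illustration and not the method evaluated in the paper. The function name `ola_time_stretch`, the duration ratio `alpha`, and the frame sizes are all assumptions made for this sketch.

```python
import math


def ola_time_stretch(samples, alpha, frame_len=400, synth_hop=100):
    """Plain overlap-add time-scale modification (illustrative only).

    alpha is the duration ratio output/input (a hypothetical parameter):
    alpha < 1 shortens the signal (faster speaking-rate),
    alpha > 1 lengthens it (slower speaking-rate).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n_in = len(samples)
    n_out = int(round(n_in * alpha))
    # Hann window sampled at half-integer points so no weight is exactly zero.
    window = [
        0.5 - 0.5 * math.cos(2.0 * math.pi * (i + 0.5) / frame_len)
        for i in range(frame_len)
    ]
    out = [0.0] * n_out
    norm = [0.0] * n_out
    # Frames are read every analysis_hop samples and written every synth_hop.
    analysis_hop = synth_hop / alpha
    k = 0
    while k * synth_hop < n_out:
        src = int(round(k * analysis_hop))
        dst = k * synth_hop
        for i in range(frame_len):
            if src + i >= n_in or dst + i >= n_out:
                break
            out[dst + i] += window[i] * samples[src + i]
            norm[dst + i] += window[i]
        k += 1
    # Divide by the summed window weights; samples no frame reached stay 0.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

Under these assumptions, a call such as `ola_time_stretch(child_utterance, 0.8)` would shorten a hypothetical list of samples to about 80% of its duration, which is the direction one would presumably need when children speak more slowly than the adult training speakers. Plain OLA does not align frames to the waveform, so it introduces audible artifacts in voiced speech; more elaborate TSM techniques exist to reduce such distortion.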
Pages: 5
Related papers
29 in total
  • [1] Measure of local speaking-rate for automatic speech recognition
    Russell, MJ
    Ponting, KM
    Tomlinson, MJ
    ELECTRONICS LETTERS, 1999, 35 (10) : 787 - 789
  • [2] Improving Children's Speech Recognition Through Time Scale Modification Based Speaking Rate Adaptation
    Kathania, Hemant K.
    Shahnawazuddin, S.
    Ahmad, Waquar
    Adiga, Nagaraj
    Jana, S. K.
    Samaddar, A. B.
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 257 - 261
  • [3] Speaking rate control based on time-scale modification and its effects on the performance of speech recognition
    Kang, Jin Ah
    Choi, Seung Ho
    INTERNATIONAL JOURNAL OF ENGINEERING SYSTEMS MODELLING AND SIMULATION, 2014, 6 (1-2) : 31 - 36
  • [4] Exploring the Role of Speaking-Rate Adaptation on Children's Speech Recognition
    Shahnawazuddin, S.
    Kathania, Hemant K.
    Singh, Chaman
    Ahmad, Waquar
    Pradhan, Gayadhar
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 21 - 25
  • [5] Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition
    Nanjo, H
    Kawahara, T
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 725 - 728
  • [6] Studying the role of pitch-adaptive spectral estimation and speaking-rate normalization in automatic speech recognition
    Shahnawazuddin, S.
    Adiga, Nagaraj
    Kathania, Hemant K.
    Pradhan, Gayadhar
    Sinha, Rohit
    DIGITAL SIGNAL PROCESSING, 2018, 79 : 142 - 151
  • [7] Wavelet speech enhancement based on time-scale adaptation
    Bahoura, Mohammed
    Rouat, Jean
    SPEECH COMMUNICATION, 2006, 48 (12) : 1620 - 1637
  • [8] Approach for time-scale modification of speech based on TCNMF
    Wu, Haijia
    Zhang, Xiongwei
    Huang, Jianjun
    Chen, Weiwei
    ELECTRONICS LETTERS, 2013, 49 (01) : 71 - 72
  • [9] EFFECT OF TIME-SCALE MODIFICATION OF SPEECH ON THE SPEECH RECOGNITION THRESHOLD IN NOISE FOR ELDERLY LISTENERS
    STOLLMAN, MHP
    KAPTEYN, TS
    AUDIOLOGY, 1994, 33 (05): : 280 - 290
  • [10] Model Adaptation for Automatic Speech Recognition Based on Multiple Time Scale Evolution
    Watanabe, Shinji
    Nakamura, Atsushi
    Juang, Biing-Hwang
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1088 - +