Enhancing Pitch Robustness of Speech Recognition System through Spectral Smoothing

Cited by: 0
Authors
Sai, B. Tarun [1]
Yadav, Ishwar Chandra [1]
Shahnawazuddin, S. [1]
Pradhan, Gayadhar [1]
Affiliations
[1] Natl Inst Technol Patna, Dept Elect & Commun Engn, Patna, Bihar, India
Source
2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018) | 2018
Keywords
Speech recognition; pitch mismatch; spectral smoothing; modified EMD; CHILDRENS SPEECH; DECOMPOSITION;
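DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809
Abstract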
In this paper, we present a novel approach to front-end speech parameterization that is more robust to pitch variations than the most commonly used technique. Earlier works have shown that insufficient smoothing of the magnitude spectrum leads to pitch-induced distortions. This, in turn, results in poor performance of speech recognition systems, especially for high-pitched child speakers. To overcome this shortcoming, the short-time magnitude spectrum is first decomposed into several components using a modified version of empirical mode decomposition (EMD). Next, the lowest-order component is discarded and the spectrum is reconstructed from the remaining higher-order modes, which smooths the spectrum sufficiently. The Mel-frequency cepstral coefficients (MFCC) are then extracted from the smoothed spectra. The signal-domain analyses presented in this paper demonstrate that the ill effects of pitch variations are significantly reduced by including the proposed spectral smoothing module. To validate this statistically, an automatic speech recognition system is developed using speech data from adult speakers. To simulate large pitch differences, evaluations are performed on a test set consisting of speech data from child speakers. Including the proposed spectral smoothing module leads to a relative improvement of 12% over the baseline system employing acoustic modeling based on a deep neural network.
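The decompose, discard, and reconstruct steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify how their EMD is modified, so the sketch uses plain EMD sifting with linearly interpolated envelopes (a simplification of the usual cubic-spline envelopes). It also assumes that "lowest-order component" means the first, fastest-oscillating mode. The names `first_imf`, `smooth_spectrum`, and `n_sift` are hypothetical. Because the EMD components sum back to the input, reconstructing from all modes except the first is the same as subtracting the first mode from the spectrum.

```python
import numpy as np


def _local_extrema(x):
    """Return indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def _envelope(x, idx):
    """Linear envelope through the given extrema, pinned at both endpoints.

    Assumption: linear interpolation stands in for the cubic splines
    normally used in EMD, to keep the sketch dependency-free.
    """
    knots = np.concatenate(([0], idx, [len(x) - 1]))
    return np.interp(np.arange(len(x)), knots, x[knots])


def first_imf(x, n_sift=10):
    """Estimate the fastest-oscillating mode with a fixed number of sifts."""
    h = x.copy()
    for i in range(n_sift):
        maxima, minima = _local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            # Too few oscillations to sift: nothing to remove on the
            # first pass, otherwise keep what has been extracted so far.
            return np.zeros_like(x) if i == 0 else h
        mean_env = 0.5 * (_envelope(h, maxima) + _envelope(h, minima))
        h = h - mean_env
    return h


def smooth_spectrum(mag):
    """Drop the first mode of a magnitude spectrum and keep the rest.

    The result is clipped at zero so it can still be treated as a
    magnitude; this clipping is an assumption of the sketch.
    """
    mag = np.asarray(mag, dtype=float)
    return np.maximum(mag - first_imf(mag), 0.0)
```

In a front end, `smooth_spectrum` would be applied to the magnitude of each short-time Fourier frame before the Mel filterbank and cepstral steps. Whether one discarded mode is enough depends on frame length and pitch, which the abstract does not state.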
Pages: 242-246
Number of pages: 5