Efficient SNR Driven SPLICE Implementation for Robust Speech Recognition

Authors
Squartini, Stefano [1]
Principi, Emanuele [1]
Cifani, Simone [1]
Rotili, Rudi [1]
Piazza, Francesco [1]
Affiliation
[1] Univ Politecn Marche, DIBET, MediaLabs3, Ancona, Italy
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The SPLICE algorithm was recently proposed in the literature to address robustness in Automatic Speech Recognition (ASR), and several variants have since been proposed to overcome some drawbacks of the original technique. This presentation discusses an efficient new solution: it is based on SNR estimation in the frequency or mel domain and investigates the use of different noise types for GMM training, in order to maximize the generalization capability of the tool and hence recognition performance in the presence of unknown noise sources. Computer simulations conducted on the AURORA2 database seem to confirm the effectiveness of the idea: the proposed approach yields accuracy similar to that of the reference approach, even though it employs a simpler mismatch compensation paradigm that needs no a priori knowledge of the noises used in the training phase.
Pages: 70-80
Page count: 11