Efficient SNR Driven SPLICE Implementation for Robust Speech Recognition

Cited by: 0

Authors:

Squartini, Stefano [1]

Principi, Emanuele [1]

Cifani, Simone [1]

Rotili, Rudi [1]

Piazza, Francesco [1]

Affiliation:

[1] Univ Politecn Marche, DIBET, MediaLabs3, Ancona, Italy

Source:

ANALYSIS OF VERBAL AND NONVERBAL COMMUNICATION AND ENACTMENT: THE PROCESSING ISSUES | 2011 / Vol. 6800

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The SPLICE algorithm has recently been proposed in the literature to address the robustness issue in Automatic Speech Recognition (ASR). Several variants have also been proposed to address some drawbacks of the original technique. This presentation discusses an innovative and efficient solution: it is based on SNR estimation in the frequency or mel domain, and it investigates the possibility of using different noise types for GMM training in order to maximize the generalization capability of the tool and, therefore, the recognition performance in the presence of unknown noise sources. Computer simulations conducted on the AURORA2 database seem to confirm the effectiveness of the idea: the proposed approach yields accuracy similar to that of the reference approach, even though it employs a simpler mismatch compensation paradigm that does not need any a priori knowledge of the noises used in the training phase.


Pages: 70-80

Page count: 11
