Channel selection measures for multi-microphone speech recognition

Cited by: 38
Authors
Wolf, Martin [1]
Nadeu, Climent [1]
Affiliations
[1] Univ Politecn Cataluna, TALP Res Ctr, Dept Signal Theory & Commun, ES-08034 Barcelona, Spain
Keywords
Automatic speech recognition; Channel (microphone) selection; Signal quality; Multi-microphone; Reverberation
DOI
10.1016/j.specom.2013.09.015
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Automatic speech recognition in a room with distant microphones is strongly affected by noise and reverberation. In scenarios where the speech signal is captured by several arbitrarily located microphones, the degree of distortion differs from one channel to another. In this work we deal with measures extracted from a given distorted signal that either estimate its quality or measure how well it fits the acoustic models of the recognition system. We then apply them to the problem of selecting the signal (i.e. the channel) that presumably leads to the lowest recognition error rate. New channel selection techniques are presented and compared experimentally in reverberant environments with other approaches reported in the literature. Significant improvements in recognition rate are observed for most of the measures. A new measure based on the variance of the speech intensity envelope shows a good trade-off between recognition accuracy, latency and computational cost. Also, the combination of measures allows a further improvement in recognition rate. © 2013 Elsevier B.V. All rights reserved.
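The idea of selecting a channel by the variance of its intensity envelope can be illustrated with a minimal sketch. This is not the measure as defined in the paper: it is a simplified, single-band version written under the assumption that reverberation fills the pauses between speech bursts and so flattens the envelope, which would make a higher variance suggest a less distorted channel. The function names, the frame length, the log-domain mean removal (used here to cancel a constant channel gain) and the cube-root compression are all choices made for this example.

```python
import math
from statistics import pvariance


def intensity_envelope(signal, frame_len=160):
    """Mean energy per non-overlapping frame (a crude intensity envelope)."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def envelope_variance(signal, frame_len=160, floor=1e-12):
    """Variance of the gain-normalized, compressed intensity envelope.

    Illustrative assumption: subtracting the mean in the log domain removes
    a constant gain, and the cube root limits the weight of loud frames.
    """
    env = intensity_envelope(signal, frame_len)
    if len(env) < 2:
        raise ValueError("signal too short for the chosen frame length")
    log_env = [math.log(max(e, floor)) for e in env]
    mean_log = sum(log_env) / len(log_env)
    compressed = [math.exp(v - mean_log) ** (1.0 / 3.0) for v in log_env]
    return pvariance(compressed)


def select_channel(channels, frame_len=160):
    """Return (index of the channel with the largest score, all scores)."""
    scores = [envelope_variance(ch, frame_len) for ch in channels]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def toy_signal(low_gain, n_frames=20, frame_len=160):
    """Synthetic tone whose frame gain alternates between 1.0 and low_gain."""
    sig = []
    for f in range(n_frames):
        gain = 1.0 if f % 2 == 0 else low_gain
        sig.extend(gain * math.sin(2 * math.pi * k / 16) for k in range(frame_len))
    return sig


# Toy data: "close" has deep envelope dips, "distant" has shallow ones,
# mimicking pauses that are partly filled by reverberation.
close = toy_signal(0.05)
distant = toy_signal(0.6)
best, scores = select_channel([distant, close])
print(best, scores)
```

On this toy input the channel with the deeper envelope modulation receives the higher score. With real recordings the measure would have to be computed per frequency band and validated against recognition results, which this sketch does not attempt.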
Pages: 170-180
Number of pages: 11