A new Kullback-Leibler VAD for speech recognition in noise

Cited by: 55
Authors
Ramírez, J [1]
Segura, JC [1]
Benítez, C [1]
de la Torre, A [1]
Rubio, AJ [1]
Affiliations
[1] Univ Granada, Dept Elect & Tecnol Computadores, Fac Ciencias, E-18071 Granada, Spain
Keywords
Kullback-Leibler (KL) divergence; noise reduction; robust speech recognition; voice activity detection (VAD)
DOI
10.1109/LSP.2003.821762
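Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract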
This letter presents a voice activity detector (VAD) based on the Kullback-Leibler (KL) divergence measure. The algorithm is evaluated in the context of the recently approved ETSI standard for distributed speech recognition (DSR). The VAD uses long-term information from the noisy speech signal to define a more robust decision rule that yields high accuracy. The Mel-scaled filter bank log-energies (FBE) are modeled by Gaussian distributions, and a symmetric KL divergence is used to estimate the distance between the speech and noise distributions. The decision rule is formulated in terms of the average subband KL divergence, which is compared to a noise-adaptable threshold. An exhaustive analysis using the AURORA databases is conducted to assess the performance of the proposed method and to compare it with existing standard VAD methods.
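The decision rule summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one univariate Gaussian per subband, uses the standard closed form for the sum KL(p||q) + KL(q||p) between two Gaussians, and takes the threshold as a plain argument because the abstract does not say how the threshold adapts to the noise. All function names, the `var_floor` parameter, and the window layout are hypothetical.

```python
from statistics import fmean, pvariance


def symmetric_kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), of two 1-D Gaussians."""
    diff_sq = (mu_p - mu_q) ** 2
    return 0.5 * (var_p / var_q + var_q / var_p - 2.0
                  + diff_sq * (1.0 / var_p + 1.0 / var_q))


def average_subband_divergence(window_fbe, noise_mean, noise_var, var_floor=1e-3):
    """Average the symmetric KL divergence over all subbands.

    window_fbe: list of frames, each a list of K log-energies (the
                long-term window around the current frame).
    noise_mean, noise_var: per-subband Gaussian parameters of the noise model.
    var_floor: assumed safeguard against zero variance (not from the source).
    """
    num_bands = len(noise_mean)
    total = 0.0
    for k in range(num_bands):
        band = [frame[k] for frame in window_fbe]
        mu = fmean(band)
        var = max(pvariance(band, mu), var_floor)
        total += symmetric_kl_gaussian(
            mu, var, noise_mean[k], max(noise_var[k], var_floor))
    return total / num_bands


def is_speech(window_fbe, noise_mean, noise_var, threshold):
    """Flag the window as speech when the average divergence exceeds the threshold."""
    return average_subband_divergence(window_fbe, noise_mean, noise_var) > threshold
```

A window whose subband statistics match the noise model gives a divergence near zero, while a window with shifted log-energy means gives a large value, so in this sketch a single threshold separates the two cases.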
Pages: 266-269
Page count: 4