Discriminative weight training for a statistical model-based voice activity detection

Cited by: 31
Authors
Kang, Sang-Ick [1]
Jo, Q-Haing [1]
Chang, Joon-Hyuk [1]
Affiliations
[1] Inha Univ, Sch Elect Engn, Inchon 402751, South Korea
Keywords
likelihood ratio; minimum classification error; statistical model; voice activity detection
DOI
10.1109/LSP.2007.913595
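Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification code
0808; 0809
Abstract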
In this letter, we apply discriminative weight training to statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios (LRs), with the weights obtained by a minimum classification error (MCE) method. This approach differs from previous work in that a different weight is assigned to each frequency bin, which is considered more realistic. According to the experimental results, the proposed approach is effective for statistical model-based VAD using the LR test.
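The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the normalization of the weights to sum to one, and the threshold value are assumptions made for illustration, and the MCE training that would produce the weights is not shown. In the log domain, a weighted geometric mean of per-bin LRs becomes a weighted sum of per-bin log LRs, which is then compared with a threshold.

```python
import math


def weighted_log_lr(log_lrs, weights):
    """Log of the weighted geometric mean of per-bin likelihood ratios.

    log_lrs: per-frequency-bin log likelihood ratios for one frame.
    weights: non-negative per-bin weights (hypothetical values here; in the
             letter they would come from MCE training). They are normalized
             to sum to one, which is an assumption of this sketch.
    """
    if len(log_lrs) != len(weights):
        raise ValueError("log_lrs and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * l for w, l in zip(weights, log_lrs)) / total


def vad_decision(log_lrs, weights, threshold=0.0):
    """Return True (speech) when the weighted log LR exceeds the threshold."""
    return weighted_log_lr(log_lrs, weights) > threshold


# Toy frame with three bins: LRs of 4, 1, and 0.25.
frame = [math.log(4.0), math.log(1.0), math.log(0.25)]
uniform = [1.0, 1.0, 1.0]          # equal weights: plain geometric mean
emphasized = [0.5, 0.25, 0.25]     # hypothetical weights favoring bin 0

print(vad_decision(frame, uniform, threshold=0.1))
print(vad_decision(frame, emphasized, threshold=0.1))
```

With equal weights the rule reduces to the unweighted geometric mean of the LRs; unequal weights let bins that discriminate speech from noise more reliably contribute more to the decision.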
Pages: 170-173
Number of pages: 4