Voice activity detection based on conditional MAP criterion

Cited by: 47
Authors
Shin, Jong Won [1]
Kwon, Hyuk Jin
Jin, Suk Ho
Kim, Nam Soo
Affiliations
[1] Seoul Natl Univ, Sch Elect Engn, Seoul 151742, South Korea
[2] Seoul Natl Univ, INMC, Seoul 151742, South Korea
Keywords
conditional MAP; likelihood ratio test; speech coding; statistical modeling; voice activity detection
DOI
10.1109/LSP.2008.917027
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
In this letter, we propose a novel approach to voice activity detection (VAD) based on the modified maximum a posteriori (MAP) criterion conditioned on the voice activity decision made in the previous frame. To exploit the inter-frame correlation of voice activity, the probability of the voice presence conditioned on both the observed spectrum and the voice activity decision in the previous frame is employed instead of the conventional strategy that depends only on the current observation. The proposed conditional MAP criterion incorporating temporal correlations leads to two separate thresholds for the likelihood ratio test (LRT) depending on the previous VAD result. Experimental results show that the VAD based on the proposed conditional MAP criterion outperforms the VAD based on the conventional MAP criterion under various noise environments.
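
The two-threshold decision rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that, given the current voice activity state, the observation is independent of the previous decision, so the MAP rule reduces to comparing the likelihood ratio against the prior odds P(noise | previous decision) / P(speech | previous decision). The function name, the transition probabilities 0.9 and 0.1, and the use of per-frame log-likelihood ratios as input are all hypothetical choices made for the example.

```python
import math


def conditional_map_vad(log_lrs, p_speech_after_speech=0.9,
                        p_speech_after_noise=0.1, initial_decision=False):
    """Label each frame as speech (True) or noise (False).

    log_lrs: per-frame log-likelihood ratios log p(X|speech) / p(X|noise).
    The probabilities are illustrative assumptions, not values from the paper.
    """
    # One threshold per previous decision, expressed in the log domain.
    log_eta_after_speech = math.log(
        (1.0 - p_speech_after_speech) / p_speech_after_speech)
    log_eta_after_noise = math.log(
        (1.0 - p_speech_after_noise) / p_speech_after_noise)

    decisions = []
    prev = initial_decision
    for llr in log_lrs:
        # Pick the threshold that matches the previous frame's decision.
        threshold = log_eta_after_speech if prev else log_eta_after_noise
        prev = llr > threshold
        decisions.append(prev)
    return decisions


if __name__ == "__main__":
    print(conditional_map_vad([0.0, 3.0, 0.0, -3.0, 0.0]))
```

With these assumed probabilities the threshold is lower after a speech frame than after a noise frame, so an ambiguous frame (log-likelihood ratio near zero) keeps the label of the frame before it. A conventional MAP detector with a single threshold would not show this hysteresis.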
Pages: 257-260
Page count: 4