A NEW APPROACH FOR ROBUST REALTIME VOICE ACTIVITY DETECTION USING SPECTRAL PATTERN

Cited by: 18
Authors
Moattar, M. H. [1 ]
Homayounpour, M. M. [1 ]
Kalantari, Nima Khademi [2 ]
Affiliations
[1] Amirkabir Univ Technol, Comp Engn & Informat Technol Dept, Lab Intelligent Signal & Speech Proc, Tehran, Iran
[2] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Voice Activity Detection; Spectral Peaks Pattern; Spectral Flatness;
DOI
10.1109/ICASSP.2010.5495597
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
This paper proposes a Voice Activity Detection (VAD) approach that applies a voting algorithm to decide whether speech is present in an audio signal. For this purpose, the approach uses three different short-time features together with the pattern of spectral peaks in every frame. The spectral peaks pattern is well suited to detecting vowel sounds in a speech signal, even in the presence of noise. This measure is therefore applicable to voice activity detection, in which vowels characterize the speech signal. Experiments show that incorporating this measure into our recently proposed VAD approach considerably improves the results of the algorithm while imposing little computational overhead. The proposed approach is evaluated on different datasets with various noise types and SNR levels, and satisfactory results are achieved.
Pages: 4478-4481
Number of pages: 4
Related papers
12 in total
[1]  
[Anonymous], 2005, Entropy
[2]   ITU-T recommendation G.729 Annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications [J].
Benyassine, A ;
Shlomot, E ;
Su, HY ;
Massaloux, D ;
Lamblin, C ;
Petit, JP .
IEEE COMMUNICATIONS MAGAZINE, 1997, 35 (09) :64-73
[3]   Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold [J].
Davis, A ;
Nordholm, S ;
Togneri, R .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (02) :412-424
[4]  
Lee B., 2007, P BIENN DSP INVEHICL
[5]   An improved voice activity detection using higher order statistics [J].
Li, K ;
Swamy, MNS ;
Ahmad, MO .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (05) :965-974
[6]   Speech pause detection for noise spectrum estimation by tracking power envelope dynamics [J].
Marzinzik, M ;
Kollmeier, B .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (02) :109-118
[7]  
Moattar M. H., 2009, 2009 17th European Signal Processing Conference (EUSIPCO 2009), P2549
[8]   A ROBUST ALGORITHM FOR ACCURATE ENDPOINTING OF SPEECH SIGNALS [J].
SAVOJI, MH .
SPEECH COMMUNICATION, 1989, 8 (01) :45-60
[9]  
Shin WH, 2000, INT CONF ACOUST SPEE, P1399, DOI 10.1109/ICASSP.2000.861845
[10]   Word boundary detection with mel-scale frequency bank in noisy environment [J].
Wu, GD ;
Lin, CT .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (05) :541-554