Voice activity detection based on using wavelet packet

Cited by: 10
Authors
Eshaghi, Mohadese [1]
Mollaei, M. R. Karami [1]
Affiliations
[1] Univ Technol, Fac Elect & Comp Engn, Babol Sar, Iran
Keywords
Speech processing; Voice activity detection; Discrete wavelet packet; Algorithms
DOI
10.1016/j.dsp.2009.11.008
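Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes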
0808 ; 0809 ;
Abstract
This paper presents a robust voice activity detection (VAD) technique based on the wavelet packet transform. In this technique, the sub-bands and their amplitudes are represented as vectors at each sample time in order to derive a new feature from changes in frequency and amplitude. Using the multi-resolution analysis property of the wavelet packet transform (WPT), the voiced, unvoiced, and transient components of speech can be clearly discriminated. A new feature extraction method is then implemented based on observations of the angles between the vectors. This feature extraction method retains most unvoiced sounds in a voice-active frame. Experimental results show that the proposed WT feature parameter can extract speech activity under poor SNR conditions and that it is insensitive to variable noise levels. (C) 2009 Elsevier Inc. All rights reserved.
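The abstract names two ingredients: per-sub-band amplitude vectors obtained from a wavelet packet decomposition, and the angles between those vectors. The following is a minimal sketch of that idea only, not the authors' algorithm. The Haar filter, the three-level depth, the per-frame (rather than per-sample) amplitude vector, and all function names are assumptions made for illustration; the paper's actual wavelet, vector construction, and decision rule are not given in this record.

```python
import math


def haar_packet(signal, levels):
    """Full Haar wavelet packet tree (assumed filter choice).

    Returns 2**levels lists of coefficients, in natural tree order
    (not sorted by frequency).
    """
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            pairs = range(0, len(band) - 1, 2)
            approx = [(band[i] + band[i + 1]) / math.sqrt(2) for i in pairs]
            detail = [(band[i] - band[i + 1]) / math.sqrt(2) for i in pairs]
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def band_amplitudes(frame, levels=3):
    """Mean absolute coefficient per sub-band: one amplitude vector per frame."""
    return [
        sum(abs(c) for c in band) / max(len(band), 1)
        for band in haar_packet(frame, levels)
    ]


def angle_between(u, v):
    """Angle in radians between two amplitude vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)
```

A hypothetical use would compute `band_amplitudes` for consecutive frames and track `angle_between` across them: a change in overall level scales the vector without rotating it, whereas a change in spectral shape rotates it, which is one plausible reading of why an angle-based feature could be insensitive to the noise level.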
Pages: 1102-1115
Number of pages: 14
Related papers
16 in total
[1]   ITU-T recommendation G.729 Annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications [J].
Benyassine, A ;
Shlomot, E ;
Su, HY ;
Massaloux, D ;
Lamblin, C ;
Petit, JP .
IEEE COMMUNICATIONS MAGAZINE, 1997, 35 (09) :64-73
[2]   A robust voice activity detector for wireless communications using soft computing [J].
Beritelli, F ;
Casale, S ;
Cavallaro, A .
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 1998, 16 (09) :1818-1829
[3]   AM-FM ENERGY DETECTION AND SEPARATION IN NOISE USING MULTIBAND ENERGY OPERATORS [J].
BOVIK, AC ;
MARAGOS, P ;
QUATIERI, TF .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1993, 41 (12) :3245-3265
[4]  
Cho YD, 2001, IEEE SIGNAL PROC LET, V8, P276, DOI 10.1109/97.957270
[5]  
COHEN A, 1993, CR ACAD SCI I-MATH, V316, P417
[6]   ENTROPY-BASED ALGORITHMS FOR BEST BASIS SELECTION [J].
COIFMAN, RR ;
WICKERHAUSER, MV .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1992, 38 (02) :713-718
[7]  
Freeman D. K., 1989, ICASSP-89: 1989 International Conference on Acoustics, Speech and Signal Processing (IEEE Cat. No.89CH2673-2), P369, DOI 10.1109/ICASSP.1989.266442
[8]   Teager energy based feature parameters for speech recognition in car noise [J].
Jabloun, F ;
Çetin, AE ;
Erzin, E .
IEEE SIGNAL PROCESSING LETTERS, 1999, 6 (10) :259-261
[9]  
KAISER JF, 1990, INT CONF ACOUST SPEE, P381, DOI 10.1109/ICASSP.1990.115702
[10]  
Kondoz A M., 1994, Digital Speech Coding for low bit rate Communication Systems