Noise robust model-based Voice Activity Detection


Authors:

de la Torre, Angel [1]

Ramirez, Javier [1]

Benitez, Carmen [1]

Segura, Jose C. [1]

Garcia, Luz [1]

Rubio, Antonio J. [1]

Affiliation:

[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

voice activity detection (VAD); vector Taylor series approach (VTS); Gaussian mixture; Wiener filtering

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We propose a model-based VAD derived from the Vector Taylor Series (VTS) approach. A Gaussian mixture trained on clean speech provides the decision rule for speech/non-speech detection. In addition, the VTS approach adapts the Gaussian mixture to the noise conditions, yielding stable performance over a wide range of SNRs. We have evaluated the method both for speech/non-speech detection and for robust speech recognition. Compared with other VAD methods, the proposed VAD shows the best trade-off in speech/non-speech detection. When applied to Wiener filtering and frame dropping, the proposed VAD also provides the best recognition results.
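The abstract gives the idea but not the formulas, so the following is only a minimal sketch of how a VTS-adapted Gaussian mixture could drive a speech/non-speech decision. It is not the authors' implementation. Several things are assumptions made for illustration: features are log filter-bank energies, covariances are diagonal, the noise is a single Gaussian, the adaptation uses the standard first-order VTS expansion of y = x + log(1 + exp(n - x)), and the decision is a thresholded log-likelihood ratio. The function names (`vts_adapt`, `vad_decision`) and the `threshold` parameter are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def vts_adapt(mean_x, var_x, mean_n, var_n):
    """Adapt one clean-speech Gaussian to additive noise (first-order VTS).

    Assumes log-spectral features, where noisy speech is modelled as
    y = x + log(1 + exp(n - x)).
    """
    mean_y, var_y = [], []
    for mx, vx, mn, vn in zip(mean_x, var_x, mean_n, var_n):
        g = 1.0 / (1.0 + math.exp(mn - mx))  # dy/dx at the expansion point
        mean_y.append(mx + math.log1p(math.exp(mn - mx)))
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mean_y, var_y


def log_gmm(x, weights, means, variances):
    """Log-likelihood of x under a diagonal GMM (log-sum-exp for stability)."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))


def vad_decision(frame, clean_gmm, noise_mean, noise_var, threshold=0.0):
    """Return (is_speech, llr) for one frame.

    The likelihood-ratio rule and the threshold are illustrative assumptions.
    """
    weights, means, variances = clean_gmm
    adapted = [vts_adapt(m, v, noise_mean, noise_var)
               for m, v in zip(means, variances)]
    adapted_means = [a[0] for a in adapted]
    adapted_vars = [a[1] for a in adapted]
    llr = (log_gmm(frame, weights, adapted_means, adapted_vars)
           - log_gauss_diag(frame, noise_mean, noise_var))
    return llr > threshold, llr
```

In this sketch the clean-speech mixture is never retrained: only the noise estimate (`noise_mean`, `noise_var`) changes, and the adapted mixture is recomputed from it. That mirrors the property the abstract highlights, where the same clean model is reused across SNRs.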


Pages: 1954-1957

Page count: 4
