Noise Robust Voice Activity Detection Using Features Extracted From the Time-Domain Autocorrelation Function

Cited by: 0

Authors:

Ghaemmaghami, Houman [1]

Baker, Brendan [1]

Vogt, Robbie [1]

Sridharan, Sridha [1]

Affiliations:

[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

voice activity detection; high noise; autocorrelation; zero-crossing rate; time-domain analysis; SPEECH;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The score outputs of the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. The accuracy of AZR was compared to state-of-the-art and standardised VAD methods, and AZR was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database, which was created using real recordings from high-noise environments.
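
The abstract names the building blocks of the method but not their exact formulation. The Python sketch below only illustrates those building blocks under stated assumptions: the function names, the biased autocorrelation estimate, the pitch-lag search range, and the fusion weight are all hypothetical choices made for illustration, not the authors' implementation. The CrossCorr feature is not reproduced because the abstract does not specify how the cross-correlation is formed. HTER is computed here as the mean of the miss rate and the false alarm rate, which is the usual definition and is assumed to match the paper's.

```python
import math


def normalised_autocorrelation(frame):
    """Return r[k] / r[0] for lags k = 0 .. len(frame) - 1.

    Assumption: a biased time-domain estimate over a single frame.
    """
    n = len(frame)
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * n  # silent frame: no periodicity to measure
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def max_peak(acf, min_lag, max_lag):
    """Largest normalised autocorrelation value in a lag range.

    min_lag and max_lag are hypothetical bounds on the pitch period in samples.
    """
    return max(acf[min_lag:max_lag + 1])


def zero_crossing_rate(values):
    """Fraction of adjacent value pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(values, values[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(values) - 1)


def weighted_sum_fusion(score_a, score_b, weight):
    """Fuse two system scores; weight in [0, 1] is an assumed tuning parameter."""
    return weight * score_a + (1.0 - weight) * score_b


def half_total_error_rate(miss_rate, false_alarm_rate):
    """Mean of miss rate and false alarm rate (assumed HTER definition)."""
    return 0.5 * (miss_rate + false_alarm_rate)
```

For a strongly periodic frame, such as a 200 Hz sine sampled at 8 kHz, `max_peak` over lags 20 to 160 is close to 1, whereas an aperiodic frame is expected to give a much smaller value. A frame-level decision could then compare the fused score against a threshold, which is again an assumption rather than something the abstract states.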


Pages: 3118-3121

Page count: 4
