Noise Robust Voice Activity Detection Using Features Extracted From the Time-Domain Autocorrelation Function

Cited by: 0
Authors
Ghaemmaghami, Houman [1]
Baker, Brendan [1]
Vogt, Robbie [1]
Sridharan, Sridha [1]
Affiliations
[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
voice activity detection; high noise; autocorrelation; zero-crossing rate; time-domain analysis; SPEECH
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The score outputs of the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. The accuracy of AZR was compared to state-of-the-art and standardised VAD methods, and AZR was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database, which was created using real recordings from high-noise environments.
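The ingredients named in the abstract (a normalised time-domain autocorrelation, its maximum peak, its zero-crossing rate, and weighted sum fusion of two scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the CrossCorr feature, the pitch-lag range, the frame length, or the fusion weights, so the function names, the 8 kHz sample rate, the 50 to 400 Hz lag range, and the use of `1 - ZCR` as a periodicity score are all assumptions chosen only for illustration.

```python
import math
import random


def normalised_autocorrelation(frame):
    """Biased autocorrelation of the mean-removed frame, scaled so lag 0 is 1."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return [0.0] * n
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def max_peak(acf, min_lag, max_lag):
    """Largest autocorrelation value inside an assumed pitch-lag range."""
    return max(acf[min_lag:max_lag + 1])


def zero_crossing_rate(values):
    """Fraction of adjacent value pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(values, values[1:]) if (a < 0) != (b < 0))
    return crossings / (len(values) - 1)


def weighted_sum_fusion(score_a, score_b, weight=0.5):
    """Weighted sum of two scores; the default weight is a placeholder."""
    return weight * score_a + (1.0 - weight) * score_b


def make_tone(n, freq_hz, sample_rate):
    """Synthetic periodic frame standing in for voiced speech."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def make_noise(n, seed=0):
    """Synthetic Gaussian noise frame with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


SAMPLE_RATE = 8000                # assumed sample rate in Hz
FRAME_LEN = 256                   # assumed frame length (32 ms at 8 kHz)
MIN_LAG = SAMPLE_RATE // 400      # lag of an assumed 400 Hz upper pitch bound
MAX_LAG = SAMPLE_RATE // 50       # lag of an assumed 50 Hz lower pitch bound

for name, frame in [("tone", make_tone(FRAME_LEN, 200.0, SAMPLE_RATE)),
                    ("noise", make_noise(FRAME_LEN))]:
    acf = normalised_autocorrelation(frame)
    peak = max_peak(acf, MIN_LAG, MAX_LAG)
    zcr = zero_crossing_rate(acf)
    # Assumption: treat a low autocorrelation ZCR as evidence of periodicity.
    fused = weighted_sum_fusion(peak, 1.0 - zcr)
    print(f"{name}: peak={peak:.3f} acf_zcr={zcr:.3f} fused={fused:.3f}")
```

In this toy setup the periodic frame yields a higher autocorrelation peak and a lower autocorrelation zero-crossing rate than the noise frame, which is the intuition behind using both quantities as voicing cues. A real system would also need frame-level thresholds or a trained decision rule, which the abstract does not describe.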
Pages: 3118-3121
Page count: 4