Noise Robust Voice Activity Detection Using Features Extracted From the Time-Domain Autocorrelation Function

Cited by: 0
Authors
Ghaemmaghami, Houman [1]
Baker, Brendan [1]
Vogt, Robbie [1]
Sridharan, Sridha [1]
Affiliations
[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
voice activity detection; high noise; autocorrelation; zero-crossing rate; time-domain analysis; SPEECH;
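DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract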
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The scores output by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. AZR was compared against state-of-the-art and standardised VAD methods and outperformed the best-performing of them, with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database, which was created using real recordings from high-noise environments.
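The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how CrossCorr combines cross-correlation with the zero-crossing rate, so only the zero-crossing rate of the normalised autocorrelation is computed here. All function names, the lag range, and the default fusion weight are hypothetical, and HTER is taken in its usual sense as the mean of the miss rate and the false alarm rate.

```python
import math


def normalised_autocorrelation(frame):
    """Return r[k] / r[0] for lags k = 0 .. len(frame) - 1 (biased estimate)."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]          # remove DC offset
    r0 = sum(v * v for v in x)             # lag-0 energy used for normalisation
    if r0 == 0.0:
        return [0.0] * n                   # silent frame: no periodicity evidence
    return [sum(x[i] * x[i + k] for i in range(n - k)) / r0 for k in range(n)]


def max_peak(acf, min_lag, max_lag):
    """Largest normalised autocorrelation value in an assumed pitch-lag range."""
    return max(acf[min_lag:max_lag + 1])


def zero_crossing_rate(seq):
    """Fraction of adjacent value pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(seq, seq[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(seq) - 1)


def weighted_sum_fusion(score_a, score_b, weight=0.5):
    """Weighted sum of two scores; the weight 0.5 is an assumed placeholder."""
    return weight * score_a + (1.0 - weight) * score_b


def half_total_error_rate(miss_rate, false_alarm_rate):
    """HTER as the average of the miss rate and the false alarm rate."""
    return 0.5 * (miss_rate + false_alarm_rate)


def sine_frame(freq_hz=100.0, sample_rate=8000, length=400):
    """Hypothetical test frame: a pure tone standing in for voiced speech."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(length)]
```

On a periodic frame such as `sine_frame()`, the autocorrelation peak in the pitch-lag range is high and the autocorrelation crosses zero rarely, whereas for white noise the peak is low and the crossings are frequent. That contrast is the intuition behind using both features as voicing evidence before fusing their scores.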
Pages: 3118-3121
Page count: 4
Related papers
50 in total
  • [41] A robust and lightweight voice activity detection algorithm for speech enhancement at low signal-to-noise ratio
    Zhu, Zhehui
    Zhang, Lijun
    Pei, Kaikun
    Chen, Siqi
    DIGITAL SIGNAL PROCESSING, 2023, 141
  • [42] Voice activity detection based on conditional random fields using multiple features
    Saito, Akira
    Nankaku, Yoshihiko
    Lee, Akinobu
    Tokuda, Keiichi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010 : 2086 - 2089
  • [43] VOICE ACTIVITY DETECTION (VAD) USING BIPOLAR PULSE ACTIVE (BPA) FEATURES
    Safie, Sairul
    Soraghan, John J.
    Petropoulakis, Lykourgos
    2015 IEEE 9TH INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING (WISP), 2015 : 130 - 135
  • [44] Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments
    Shota Morita
    Masashi Unoki
    Xugang Lu
    Masato Akagi
    Journal of Signal Processing Systems, 2016, 82 : 163 - 173
  • [45] Multi-band long-term signal variability features for robust voice activity detection
    Tsiartas, Andreas
    Chaspari, Theodora
    Katsamanis, Nassos
    Ghosh, Prasanta
    Li, Ming
    Van Segbroeck, Maarten
    Potamianos, Alexandros
    Narayanan, Shrikanth S.
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013 : 718 - 722
  • [46] Real-Time Seismic Event Detection Using Voice Activity Detection Techniques
    Lara-Cueva, Roman A.
    Sebastian Moreno, Andres
    Larco, Julio C.
    Benitez, Diego S.
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2016, 9 (12) : 5533 - 5542
  • [47] Robust speaker verification in air traffic control using improved voice activity detection
    Neffe, Michael
    Van Pham, Tuan
    Pernkopf, Franz
    Kubin, Gernot
    PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PATTERN RECOGNITION, AND APPLICATIONS, 2007 : 298 - +
  • [48] ROBUST VOICE ACTIVITY DETECTION USING EMPIRICAL MODE DECOMPOSITION AND MODULATION SPECTRUM ANALYSIS
    Kanai, Yasuaki
    Unoki, Masashi
    2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, 2012 : 400 - 404
  • [49] A voice activity detection algorithm in spectro-temporal domain using sparse representation
    Eshaghi, Mohadese
    Razzazi, Farbod
    Behrad, Alireza
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (07) : 1791 - 1803
  • [50] Noise Robust Voice Activity Detection Based on Multi-Layer Feed-Forward Neural Network
    Arslan, Ozkan
    Engin, Erkan Zeki
    ELECTRICA, 2019, 19 (02) : 91 - 100