Noise Robust Voice Activity Detection Using Features Extracted From the Time-Domain Autocorrelation Function

Cited by: 0
Authors
Ghaemmaghami, Houman [1 ]
Baker, Brendan [1 ]
Vogt, Robbie [1 ]
Sridharan, Sridha [1 ]
Affiliations
[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
voice activity detection; high noise; autocorrelation; zero-crossing rate; time-domain analysis; SPEECH;
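DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract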
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The scores output by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. Accuracy of AZR was compared to state-of-the-art and standardised VAD methods and was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database created using real recordings from high-noise environments.
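
The ingredients named in the abstract (the peak of the normalised autocorrelation, the zero-crossing rate of that autocorrelation, and weighted sum fusion) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the lag range, the frame length, the fusion weight, or how the cross-correlation enters the CrossCorr feature, so every parameter value and helper name below is an assumption chosen for illustration, and the cross-correlation step is omitted. The quantity `1 - ZCR` is a hypothetical stand-in for a periodicity score.

```python
import math
import random


def normalised_autocorrelation(frame, max_lag):
    """Return r[0..max_lag], scaled so that r[0] == 1.0 (all zeros for a silent frame)."""
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
            for lag in range(max_lag + 1)]


def max_peak(acf, min_lag):
    """Largest autocorrelation value at or beyond min_lag (skips the trivial lag-0 peak)."""
    return max(acf[min_lag:])


def zero_crossing_rate(values):
    """Fraction of adjacent pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(values, values[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(values) - 1)


def fuse(score_a, score_b, weight):
    """Weighted sum fusion of two scores; weight is a hypothetical tuning parameter."""
    return weight * score_a + (1.0 - weight) * score_b


def frame_score(frame, min_lag, max_lag, weight):
    """Illustrative per-frame score: fuse the peak with (1 - ZCR) of the autocorrelation."""
    acf = normalised_autocorrelation(frame, max_lag)
    peak = max_peak(acf, min_lag)
    periodicity = 1.0 - zero_crossing_rate(acf)  # assumed stand-in, not the paper's CrossCorr
    return fuse(peak, periodicity, weight)


def make_frame(rng, n, sample_rate, tone_hz, tone_amp, noise_std):
    """Synthetic test frame: a sine tone (crude stand-in for voiced speech) plus white noise."""
    return [tone_amp * math.sin(2.0 * math.pi * tone_hz * i / sample_rate)
            + rng.gauss(0.0, noise_std)
            for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed settings: 8 kHz sampling, 400-sample frames, lags 20..160 (about 50-400 Hz).
    voiced_like = make_frame(rng, 400, 8000, 100.0, 1.0, 0.5)
    noise_only = make_frame(rng, 400, 8000, 100.0, 0.0, 0.7)
    print("voiced-like score:", frame_score(voiced_like, 20, 160, 0.5))
    print("noise-only score: ", frame_score(noise_only, 20, 160, 0.5))
```

On the synthetic frames, the periodic frame yields a higher autocorrelation peak and a lower autocorrelation zero-crossing rate than the noise-only frame, which is the intuition behind using these quantities as voicing cues. A real detector would threshold the fused score per frame, and the weight and threshold would need tuning on labelled data.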
Pages: 3118-3121
Page count: 4