Robust Voice Activity Detection Using Gammatone Filtering and Entropy

Cited by: 0
Authors
Ong, W. Q. [1]
Tan, A. W. C. [1]
Affiliations
[1] Multimedia Univ, Fac Engn & Technol, Commun Syst & Algorithms Res Lab, Jalan Ayer Keroh Lama, Melaka 75450, Malaysia
Source
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON ROBOTICS, AUTOMATION AND SCIENCES (ICORAS 2016) | 2016
Keywords
Voice activity detector (VAD); gammatone filtering; information-theoretic measures; SPEECH;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
A voice activity detector (VAD) detects the presence or absence of human voice in a signal. A robust VAD algorithm is essential for distinguishing human voice in a noisy acoustic signal. Many recent works on robust VAD focus on unsupervised feature extraction, using features such as temporal variation and signal-to-noise ratio (SNR) [1]. However, these methods are typically sensitive to nonstationary noise, especially at low SNR. To overcome these problems, this paper presents a robust VAD method that combines gammatone filtering with entropy, an information-theoretic measure, in the detection algorithm. The proposed algorithm is tested on speech signals from the TIMIT test corpus with additive noise at varying SNR levels. The results show that the proposed VAD outperforms other existing methods in terms of detection accuracy.
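The abstract names the two building blocks (gammatone filtering and entropy) but not how they are combined. The sketch below is one plausible reading, not the authors' algorithm: it filters the signal with a small bank of 4th-order gammatone FIR filters, then computes the Shannon entropy of the per-band energy distribution in each frame. The function names, band centers, frame sizes, and the use of band-energy entropy are all assumptions made for illustration.

```python
import numpy as np

def gammatone_ir(fc, fs, order=4, duration=0.025):
    """Unit-energy impulse response of a gammatone filter centered at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # Glasberg-Moore ERB in Hz
    b = 1.019 * erb                            # bandwidth parameter
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))

def band_entropy(x, fs, centers, frame_len=400, hop=160):
    """Per-frame Shannon entropy (bits) of the energy split across gammatone bands.

    Hypothetical feature for illustration; the paper's exact feature is not
    specified in the abstract.
    """
    bands = np.stack([np.convolve(x, gammatone_ir(fc, fs), mode="same")
                      for fc in centers])
    n_frames = 1 + (len(x) - frame_len) // hop
    ent = np.empty(n_frames)
    for i in range(n_frames):
        seg = bands[:, i * hop:i * hop + frame_len]
        energy = np.sum(seg ** 2, axis=1) + 1e-12   # avoid log(0)
        p = energy / energy.sum()                   # normalize to a distribution
        ent[i] = -np.sum(p * np.log2(p))
    return ent

if __name__ == "__main__":
    fs = 16000
    centers = [200, 400, 800, 1600, 3200]           # assumed band centers in Hz
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(fs)                 # 1 s of white noise
    tone = np.sin(2 * np.pi * 800 * np.arange(fs) / fs)
    print("mean entropy, noise:", band_entropy(noise, fs, centers).mean())
    print("mean entropy, tone: ", band_entropy(tone, fs, centers).mean())
```

In this sketch a spectrally concentrated signal (the tone) gives lower entropy than white noise, whose energy spreads across all bands. A detector would then compare the per-frame entropy against a threshold; how that threshold is chosen, and whether the paper uses this feature at all, is not stated in the abstract.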
Pages: 5
Related papers
14 in total
[1] Single Frequency Filtering Approach for Discriminating Speech and Nonspeech [J]. Aneeja, G.; Yegnanarayana, B. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (04): 705-717
[2] ETSI, 1999, 301708 ETSI EN, V7, P1
[3] Garofolo J. S., 1993, TIMIT ACOUSTIC PHONE
[4] Germain FG, 2013, INTERSPEECH, P732
[5] Jiang W., 2010, INT C AUD LANG IM PR
[6] JOHANNESMA PIM, 1972, P S HEAR THEOR
[7] Practical Gammatone-Like Filters for Auditory Processing [J]. Katsiamis, A. G.; Drakakis, E. M.; Lyon, R. F. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2007, 2007 (1)
[8] Lindberg B., 2009, P INT C SPOK LANG PR
[9] Qi J, 2013, IEEE INT SYMP CIRC S, P305, DOI 10.1109/ISCAS.2013.6571843
[10] Uvliden A, 1998, CONF REC ASILOMAR C, P343, DOI 10.1109/ACSSC.1998.750883