An improved voice activity detection algorithm for GSM adaptive multi-rate speech codec based on wavelet and support vector machine

Cited by: 0
Authors
Chen, Shi-Huang
Chang, Yaotsu
Truong, T. K.
Affiliations
Source
New Trends in Applied Artificial Intelligence, Proceedings | 2007 / Vol. 4570
Keywords
GSM AMR; VAD; wavelet; support vector machine;
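DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract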
This paper proposes an improved voice activity detection (VAD) algorithm for controlling discontinuous transmission (DTX) in the GSM adaptive multi-rate (AMR) speech codec. First, the original IIR filter bank and open-loop pitch detector are replaced by a wavelet filter bank and a wavelet-based pitch detection algorithm, respectively. The wavelet filter bank divides the input speech signal into 9 frequency bands so that the signal level in each sub-band can be calculated. In addition, the background noise in each sub-band is estimated with a wavelet de-noising method. The wavelet filter bank is also used to detect correlated complex signals such as music. A support vector machine (SVM) is then trained to obtain an optimized non-linear VAD decision rule based on the sub-band power, noise level, pitch period, tone flag, and complex-signal warning flag of the input speech signal. With the trained SVM, the proposed VAD algorithm produces more accurate detection results. Experiments on the Aurora speech database show that the proposed algorithm performs considerably better than AMR VAD Option 1 and comparably to AMR VAD Option 2.
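The sub-band level computation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which wavelet, band edges, or frame length the authors use, so everything below is an assumption made for illustration: an orthonormal Haar wavelet, an 8-level decomposition (8 detail bands plus 1 approximation band, giving 9 bands), and a hypothetical 256-sample frame. The function names `haar_step` and `subband_levels` are invented here and do not come from the paper.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def subband_levels(frame, levels=8):
    """Split a frame into levels + 1 bands and return the energy of each band,
    ordered from the lowest-frequency band to the highest.

    Assumption: Haar wavelet and dyadic band split; the paper's actual
    filter bank may differ.
    """
    if len(frame) % (1 << levels) != 0:
        raise ValueError("frame length must be a multiple of 2**levels")
    approx = list(frame)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # first entry is the highest-frequency band
    bands = [approx] + details[::-1]
    return [sum(c * c for c in band) for band in bands]


# Hypothetical input: a 1 kHz tone sampled at 8 kHz, 256 samples long.
frame = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(256)]
print(subband_levels(frame))
```

Because the Haar transform used here is orthonormal, the nine band energies sum to the energy of the input frame. A feature vector built from such levels, together with the other quantities listed in the abstract, is the kind of input an SVM classifier would be trained on.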
Pages: 915-924
Page count: 10