Voice activity detection using neural network

Times cited: 0
Authors
Ikedo, J [1]
Affiliations
[1] NTT, Wireless Syst Labs, Radio Commun Syst Lab, Yokosuka, Kanagawa 2390847, Japan
Keywords
discontinuous transmission; voice activity detection; neural network; background noise; multimedia communication
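DOI
Not available
Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract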
Voice activity detection (VAD) determines whether a short-time speech frame contains voice or silence. VAD is useful for reducing the mean speech coding rate by suppressing transmission during silent periods, and it is effective when speech and other data are transmitted simultaneously. This letter describes a VAD system that uses a neural network. The network takes several parameters obtained by analyzing slices of the speech waveform and outputs a single scalar value related to voice activity. This output is compared with a threshold to determine whether the slice is voice or silence. With the proposed VAD system, the mean code transfer rate can be reduced to less than 50%.
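
The decision rule summarized in the abstract (frame parameters in, one scalar out, threshold comparison) can be sketched as follows. This is only an illustration under stated assumptions, not the letter's implementation: the abstract does not say which parameters are used, how large the network is, or what threshold is applied, so the two features (log energy and zero-crossing rate), the two-unit hidden layer, the hand-picked weights in `PARAMS`, and the 0.5 threshold are all hypothetical. The helper `active_fraction` additionally assumes that nothing at all is sent for silent frames, which is a simplification.

```python
import math


def frame_features(frame):
    """Return two illustrative parameters for one frame (assumed, not from the letter)."""
    n = len(frame)
    energy = sum(x * x for x in frame) / n
    log_energy = math.log10(energy + 1e-12)  # small floor avoids log10(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zero_crossing_rate = crossings / (n - 1)
    return [log_energy, zero_crossing_rate]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def vad_score(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer feedforward network that outputs a single scalar in (0, 1)."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def is_voice(frame, params, threshold=0.5):
    """Compare the network output with a threshold to label the frame."""
    return vad_score(frame_features(frame), *params) > threshold


def active_fraction(decisions):
    """Fraction of frames labelled voice; equals the relative mean rate
    under the simplifying assumption that silent frames cost zero bits."""
    return sum(1 for d in decisions if d) / len(decisions)


# Hypothetical hand-picked weights; a real system would learn them from data.
PARAMS = (
    [[2.0, -1.0], [1.5, 0.5]],  # hidden-layer weights, one row per hidden unit
    [4.0, 3.0],                 # hidden-layer biases
    [3.0, 3.0],                 # output-layer weights
    -3.0,                       # output-layer bias
)

if __name__ == "__main__":
    # 10 ms frames at an assumed 8 kHz sampling rate: a 200 Hz tone, loud and very quiet.
    loud = [0.5 * math.sin(math.pi * n / 20) for n in range(80)]
    quiet = [0.001 * math.sin(math.pi * n / 20) for n in range(80)]
    decisions = [is_voice(f, PARAMS) for f in (loud, quiet, quiet, loud)]
    print(decisions, active_fraction(decisions))
```

In this simplified model, a conversation in which fewer than half of the frames are labelled voice would give a relative mean rate below 0.5, which is the kind of reduction the abstract reports. Deployed discontinuous-transmission schemes typically also send occasional background-noise updates, so the real saving would be somewhat smaller than `active_fraction` suggests.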
Pages: 2509-2513
Number of pages: 5