Using Artificial Neural Network For Robust Voice Activity Detection Under Adverse Conditions

Cited by: 0
Authors
Pham, Tuan V. [1 ]
Tang, Chien T. [1 ]
Stadtschnitzer, Michael [2 ]
Affiliations
[1] Univ Danang, Univ Technol, Elect & Telecomm Engr Dept, Danang, Vietnam
[2] Graz Univ Technol, Graz Signal Proc & Speech Comm Inst, Inst Appl Syst Technol, JOANNEUM Res, Graz, Austria
Source
2009 IEEE-RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES: RESEARCH, INNOVATION AND VISION FOR THE FUTURE | 2009
Keywords
MODEL; RECOGNITION; VAD;
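DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract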
We present an approach to model-based voice activity detection (VAD) for harsh environments. Using mel-frequency cepstral coefficient (MFCC) features extracted from clean and noisy speech samples, an artificial neural network is optimally trained to provide a reliable model. There are three main aspects to this study. First, in addition to the developed model, recent state-of-the-art VAD methods are analyzed extensively. Second, we present an optimization procedure for neural network training, including evaluation of the trained network's performance with appropriate measures. Third, we provide a large set of empirical results on the noisy TIMIT and SNOW corpora, covering different types of noise at different signal-to-noise ratios. We evaluate the resulting VAD model on the noisy corpora and compare it against state-of-the-art VAD methods such as ITU-T Rec. G.729 Annex B, ETSI AFE ES 202 050, and recent promising VAD algorithms. Results show that: (i) the proposed neural network classifier using MFCC features achieves consistently high scores under different noisy conditions; (ii) the proposed model is superior to the other VAD methods in terms of various classification measures; (iii) the robustness of the developed VAD algorithm still holds when it is tested in a completely mismatched environment.
Pages: 35 / +
Page count: 2