Online Unsupervised Classification With Model Comparison in the Variational Bayes Framework for Voice Activity Detection

Cited by: 8
Authors
Cournapeau, David [1, 2]
Watanabe, Shinji [2 ]
Nakamura, Atsushi [2 ]
Kawahara, Tatsuya [1 ]
Affiliations
[1] Kyoto Univ, Sch Informat, Kyoto 6068501, Japan
[2] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan
Keywords
Sequential estimation; speech analysis; variational Bayes (VB); voice activity detection (VAD); speech recognition; EM algorithm
DOI
10.1109/JSTSP.2010.2080821
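Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809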
Abstract
A new online, unsupervised method for voice activity detection (VAD) is proposed. Conventional VAD methods often rely on heuristics to adapt the decision threshold to the estimated SNR. The proposed method is based on the variational Bayes (VB) approach to online expectation maximization (EM), so it can adapt the decision level and the statistical model automatically and at the same time. We consider two parallel classifiers, one for the noise-only case and the other for the speech-and-noise case. Both models are trained concurrently and online within the VB framework. The VB framework also provides an explicit approximation of the log evidence, called the free energy, which is used to assess the reliability of the classifier online and to decide which model is more appropriate at a given time frame. Experimental evaluations were conducted on the CENSREC-1-C database, which is designed for VAD evaluation. Owing to the model comparison, the proposed scheme outperforms conventional VAD algorithms, especially in the remote recording condition. It is also shown to be more robust to changes in noise type.
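As a reading aid, the role the abstract gives to the free energy can be written out in the standard variational Bayes form. This is a generic sketch, not the paper's own derivation: the symbols X (observed frames), Z (latent class labels), θ (model parameters), m (model index), q (variational posterior), and the frame index t are assumed notation, and the paper's online formulation may differ in detail. Some texts define the free energy with the opposite sign; the lower-bound convention is used here because the abstract describes it as an approximation of the log evidence.

```latex
% Generic VB identity (assumed notation, not taken from the paper):
% the log evidence of model m splits into the free energy and a KL term.
\begin{align}
  \log p(X \mid m)
    &= \mathcal{F}_m[q]
     + \mathrm{KL}\bigl( q(Z,\theta) \,\|\, p(Z,\theta \mid X, m) \bigr)
     \;\ge\; \mathcal{F}_m[q], \\
  \mathcal{F}_m[q]
    &= \mathbb{E}_{q}\bigl[ \log p(X, Z, \theta \mid m) \bigr]
     - \mathbb{E}_{q}\bigl[ \log q(Z, \theta) \bigr].
\end{align}
% Model comparison at frame t (hypothetical notation): keep the model
% whose free energy is currently larger.
\begin{equation}
  \hat{m}_t = \operatorname*{arg\,max}_{m \in \{\text{noise},\ \text{speech+noise}\}}
              \mathcal{F}_m^{(t)}.
\end{equation}
```

Because the KL term is non-negative, the free energy is a lower bound on the log evidence, which is what makes comparing it across the noise-only and speech-and-noise models a reasonable proxy for comparing their evidence.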
Pages: 1071-1083
Number of pages: 13
Related papers
50 in total
  • [1] USING ONLINE MODEL COMPARISON IN THE VARIATIONAL BAYES FRAMEWORK FOR ONLINE UNSUPERVISED VOICE ACTIVITY DETECTION
    Cournapeau, David
    Watanabe, Shinji
    Nakamura, Atsushi
    Kawahara, Tatsuya
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4462 - 4465
  • [2] Using Variational Bayes free energy for unsupervised voice activity detection
    Cournapeau, David
    Kawahara, Tatsuya
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4429 - 4432
  • [3] Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency
    Volant, Stevenn
    Martin-Magniette, Marie-Laure
    Robin, Stephane
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (08) : 2375 - 2387
  • [4] Voice Activity Detection Based on an Unsupervised Learning Framework
    Ying, Dongwen
    Yan, Yonghong
    Dang, Jianwu
    Soong, Frank K.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (08) : 2624 - 2632
  • [5] An improved model for unsupervised voice activity detection
    Sharma, Shilpa
    Malhotra, Rahul
    Sharma, Anurag
    INTERNATIONAL JOURNAL OF NANOTECHNOLOGY, 2023, 20 (1-4) : 235 - 258
  • [6] A Lightweight Framework for Online Voice Activity Detection in the Wild
    Xu, Xuenan
    Dinkel, Heinrich
    Wu, Mengyue
    Yu, Kai
    INTERSPEECH 2021, 2021, : 371 - 375
  • [7] Innovative Method for Unsupervised Voice Activity Detection and Classification of Audio Segments
    Ali, Zulfiqar
    Talha, Muhammad
    IEEE ACCESS, 2018, 6 : 15494 - 15504
  • [8] Online model selection based on the variational bayes
    Sato, M
    NEURAL COMPUTATION, 2001, 13 (07) : 1649 - 1681
  • [9] A variational Bayes model for count data learning and classification
    Bakhtiari, Ali Shojaee
    Bouguila, Nizar
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2014, 35 : 176 - 186
  • [10] Online Naive Bayes Classification for Network Intrusion Detection
    Gumus, Fatma
    Sakar, C. Okan
    Erdem, Zeki
    Kursun, Olcay
    2014 PROCEEDINGS OF THE IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2014), 2014, : 670 - 674