Voice Activity Detector (VAD) Based on Long-Term Mel Frequency Band Features

Cited by: 7
Authors
Salishev, Sergey [1]
Barabanov, Andrey [1]
Kocharov, Daniil [1]
Skrelin, Pavel [1]
Moiseev, Mikhail [2]
Affiliations
[1] St Petersburg State Univ, St Petersburg, Russia
[2] Intel Corp, Intel Labs, Santa Clara, CA 95054 USA
Source
TEXT, SPEECH, AND DIALOGUE | 2016 / Vol. 9924
Keywords
Voice Activity Detector; Classification; Decision tree ensemble; Auditory masking;
DOI
10.1007/978-3-319-45510-5_40
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a VAD that uses long-term (200 ms) Mel frequency band statistics, auditory masking, and a pre-trained two-level classifier based on a decision tree ensemble. This allows it to capture the syllable-level structure of speech and discriminate it from common noises. On the test dataset, the proposed algorithm demonstrates almost 100% acceptance of clear voice for English, Chinese, Russian, and Polish speech and 100% rejection of stationary noises, independently of loudness. The algorithm is intended to be used as a trigger for automatic speech recognition (ASR). It reuses the short-term FFT analysis (STFFT) from the ASR frontend, with an additional 2 KB of memory and 15% complexity overhead.
Pages: 352-358
Page count: 7
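
Illustrative sketch
The abstract describes computing statistics of Mel frequency band energies over a long-term 200 ms window on top of an existing short-term FFT. The Python sketch below shows one way such features could be formed. It is not the authors' implementation: the 25 ms frame, 10 ms hop, 8 bands, and the choice of per-band mean and standard deviation as the "statistics" are all assumptions made here for illustration, and the auditory masking and decision tree classifier stages are omitted. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_bands, n_fft // 2 + 1)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_bands + 2))
    bank = np.zeros((n_bands, freqs.size))
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def long_term_mel_stats(signal, sample_rate=16000, frame_ms=25, hop_ms=10,
                        window_ms=200, n_bands=8):
    """Per-band mean and std of log Mel energies over sliding long-term windows.

    All parameter defaults are assumptions, not values taken from the paper
    (only the 200 ms window length appears in the abstract).
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Short-term analysis: windowed frames -> power spectrum -> log Mel bands.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bank = mel_filterbank(n_bands, frame_len, sample_rate)
    log_bands = np.log(power @ bank.T + 1e-10)

    # Long-term analysis: statistics over every run of frames spanning window_ms.
    span = window_ms // hop_ms
    n_windows = n_frames - span + 1
    win_idx = np.arange(span)[None, :] + np.arange(n_windows)[:, None]
    windows = log_bands[win_idx]  # shape (n_windows, span, n_bands)
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)
```

Under these assumptions, a stationary input yields near-zero standard deviations within each 200 ms window, while a signal whose energy is modulated at a syllable-like rate (a few hertz) yields clearly larger ones. This is the kind of contrast a downstream classifier could exploit.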