Efficient voice activity detection algorithm using long-term spectral flatness measure

Cited by: 52
Authors
Ma, Yanna [1]
Nishihara, Akinori [1]
Affiliations
[1] Tokyo Inst Technol, Dept Commun & Integrated Syst, Tokyo 1528552, Japan
Source
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2013
Keywords
NOISE;
DOI
10.1186/1687-4722-2013-21
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper proposes a novel and robust voice activity detection (VAD) algorithm based on a long-term spectral flatness measure (LSFM), which is capable of working at signal-to-noise ratios (SNRs) of 10 dB and lower. The new LSFM-based VAD improves speech detection robustness in various noisy environments by employing a low-variance spectrum estimate and an adaptive threshold. The discriminative power of the new LSFM feature is shown through an analysis of the speech/non-speech LSFM distributions. The proposed algorithm was evaluated under 12 types of noise (11 from NOISEX-92 plus speech-shaped noise) and five SNR levels on the core TIMIT test corpus. Comparisons with three modern standardized algorithms (ETSI adaptive multi-rate (AMR) options AMR1 and AMR2, and ITU-T G.729) demonstrate that the proposed LSFM-based VAD scheme achieves the best average accuracy rate. The proposed method is also compared with a long-term signal variability (LTSV)-based VAD scheme. The results show that the proposed algorithm outperforms the LTSV-based VAD scheme for most of the noises considered, including difficult noises such as machine gun noise and speech babble noise.
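The abstract does not give the LSFM formula, so the following is only an illustrative sketch of the general idea behind a long-term spectral flatness feature, not the authors' implementation. It assumes the feature is the sum, over frequency bins, of the log ratio between the geometric and arithmetic means of the power spectrum taken across recent frames. The function names, the `eps` floor, and the fixed threshold are hypothetical; the paper's low-variance spectrum estimate and adaptive threshold are not reproduced here.

```python
import math


def long_term_flatness(power_frames, eps=1e-12):
    """Sum over frequency bins of log10(GM / AM).

    power_frames: list of frames, each a list of power-spectrum values
    (one per frequency bin). The geometric mean (GM) and arithmetic
    mean (AM) are taken across frames for each bin. The result is
    about 0 for a spectrum that is steady over time and negative
    when the spectrum fluctuates from frame to frame.
    """
    num_frames = len(power_frames)
    num_bins = len(power_frames[0])
    total = 0.0
    for k in range(num_bins):
        # Floor each value to avoid log10(0); eps is an assumed constant.
        column = [max(frame[k], eps) for frame in power_frames]
        log_gm = sum(math.log10(p) for p in column) / num_frames
        log_am = math.log10(sum(column) / num_frames)
        total += log_gm - log_am
    return total


def is_speech(power_frames, threshold):
    """Hypothetical decision rule: flag speech when the feature is low.

    A fixed threshold is used here purely for illustration.
    """
    return long_term_flatness(power_frames) < threshold


# Toy data: a spectrum that is steady over time versus a fluctuating one.
steady = [[1.0, 2.0, 3.0] for _ in range(4)]
varying = [
    [1.0, 2.0, 3.0],
    [100.0, 0.01, 3.0],
    [1.0, 2.0, 3.0],
    [0.5, 50.0, 3.0],
]
print(long_term_flatness(steady), long_term_flatness(varying))
```

By the inequality between arithmetic and geometric means, the value is never positive. Under the stated assumption, stationary noise keeps it close to zero while speech-like spectra that change over time push it lower, which is the kind of separation a threshold can exploit.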
Pages: 18
Related papers
  • [1] Efficient voice activity detection algorithm using long-term spectral flatness measure
    Yanna Ma
    Akinori Nishihara
    EURASIP Journal on Audio, Speech, and Music Processing, 2013
  • [2] Erratum to: Efficient voice activity detection algorithm using long-term spectral flatness measure
    Yanna Ma
    Akinori Nishihara
    EURASIP Journal on Audio, Speech, and Music Processing, 2015
  • [3] Efficient voice activity detection algorithm using long-term spectral flatness measure (vol 2013, 21, 2013)
    Ma, Yanna
    Nishihara, Akinori
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2015
  • [4] Efficient voice activity detection algorithms using long-term speech information
    Ramírez, J
    Segura, JC
    Benítez, C
    de la Torre, A
    Rubio, A
    SPEECH COMMUNICATION, 2004, 42 (3-4) : 271 - 287
  • [5] Robust Voice Activity Detection Using the Combination of Short-Term and Long-Term Spectral Patterns
    Tan, Yingwei
    Liu, Wenju
    PATTERN RECOGNITION (CCPR 2014), PT II, 2014, 484 : 428 - 435
  • [6] Voice-Activity Detection Using Long-Term Sub-Band Entropy Measure
    Wang, Kun-Ching
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2012, E95A (09) : 1606 - 1609
  • [7] Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure
    Sang-Ick Kang
    Joon-Hyuk Chang
    Circuits, Systems and Signal Processing, 2010, 29 : 183 - 194
  • [8] Voice Activity Detection Based on Discriminative Weight Training Incorporating a Spectral Flatness Measure
    Kang, Sang-Ick
    Chang, Joon-Hyuk
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2010, 29 (02) : 183 - 194
  • [9] Voice activity detection algorithm based on long-term pitch information
    Yang, Xu-Kui
    He, Liang
    Qu, Dan
    Zhang, Wei-Qiang
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016
  • [10] Voice activity detection algorithm based on long-term pitch information
    Xu-Kui Yang
    Liang He
    Dan Qu
    Wei-Qiang Zhang
    EURASIP Journal on Audio, Speech, and Music Processing, 2016