Robust Voice Activity Detection Based on Adaptive Sub-band Energy Sequence Analysis and Harmonic Detection

Cited by: 0
Authors
Guo, Yanmeng [1]
Fu, Qiang [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIT Speech Lab, Beijing 100080, Peoples R China
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
voice activity detection; harmonic structure; noise robustness; automatic speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Voice activity detection (VAD) in real-world noise is a very challenging task. In this paper, a two-step methodology is proposed to solve the problem. First, segments with non-stationary components, including speech and dynamic noise, are located using sub-band energy sequence analysis (SESA). Second, voice is detected within the selected segments using a proposed method based on its harmonic structure. Speech segments can therefore be accurately detected by this rule-based framework. The algorithm is evaluated on several databases in terms of speech/non-speech discrimination and in terms of word accuracy rate when it is used as the front-end of an automatic speech recognition (ASR) system. It provides more reliable performance than the commonly used standard methods.
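The two-step idea in the abstract can be illustrated with a rough sketch. This is not the authors' SESA or their harmonic detector, whose details are not given in this record. It is a simplified stand-in that assumes a percentile-based noise floor per sub-band for step one and a normalized autocorrelation peak as the harmonicity measure for step two. All function names, frame sizes, and thresholds below are hypothetical choices made for illustration only.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def subband_energies(frames, n_bands):
    """Energy per frame in n_bands contiguous frequency bands."""
    window = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    bands = np.array_split(spec, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=1)

def nonstationary_mask(energies, ratio_db=6.0):
    """Step 1 (assumed rule): flag frames where any band rises well above
    a per-band noise floor, taken here as the 10th percentile over time."""
    floor = np.percentile(energies, 10, axis=0) + 1e-12
    excess_db = 10.0 * np.log10((energies + 1e-12) / floor)
    return (excess_db > ratio_db).any(axis=1)

def harmonicity(frame, sr, f0_min=80.0, f0_max=400.0):
    """Step 2 (assumed measure): largest autocorrelation value in the
    plausible pitch-lag range, normalized by the zero-lag value."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return 0.0
    lo, hi = int(sr / f0_max), int(sr / f0_min)
    return float(ac[lo:hi + 1].max() / ac[0])

def two_step_vad(x, sr, frame_len=400, hop=160, n_bands=8, harm_thresh=0.5):
    """Per-frame decision: non-stationary AND harmonic."""
    frames = frame_signal(x, frame_len, hop)
    candidates = nonstationary_mask(subband_energies(frames, n_bands))
    voiced = np.array([harmonicity(f, sr) > harm_thresh for f in frames])
    return candidates & voiced
```

In this sketch a burst of loud broadband noise passes step one because its energy jumps above the floor, but it is rejected in step two because it has no clear periodicity. That mirrors the abstract's distinction between speech and dynamic noise within the non-stationary segments.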
Pages: 1637-1640
Page count: 4
Related papers
50 in total
  • [31] Robust voice activity detection based on noise eigenspace
    Ying, Dongwen
    Shi, Yu
    Lu, Xugang
    Dang, Jianwu
    Soong, Frank
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2007, 28 (06) : 413 - 423
  • [32] Formant-Based Robust Voice Activity Detection
    Yoo, In-Chul
    Lim, Hyeontaek
    Yook, Dongsuk
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (12) : 2238 - 2245
  • [33] Robust Voice Activity Detection Using Selectively Energy Features
    Wakasugi, Junichiro
    Hayasaka, Noboru
    Iiguni, Youji
    2014 21ST IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS), 2014, : 359 - 362
  • [34] Spectrum Energy Based Voice Activity Detection
    Pang, Jing
    2017 IEEE 7TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE IEEE CCWC-2017, 2017,
  • [35] Blind-detection assisted sub-band adaptive turbo-coded OFDM schemes
    Univ of Southampton, Southampton, United Kingdom
    IEEE Veh Technol Conf, (489-493):
  • [36] An energy-based adaptive voice detection approach
    Zhang, Sen
    2006 8TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-4, 2006, : 158 - 161
  • [37] Wavelet based robust sub-band features for phoneme recognition
    Farooq, O
    Datta, S
    IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 2004, 151 (03): : 187 - 193
  • [38] Blind-detection assisted sub-band adaptive turbo-coded OFDM schemes
    Keller, T
    Hanzo, L
    1999 IEEE 49TH VEHICULAR TECHNOLOGY CONFERENCE, VOLS 1-3: MOVING INTO A NEW MILLENIUM, 1999, : 489 - 493
  • [39] The performance analysis of chinese speech endpoint detection based on continuous multi sub-band spectral features
    He, SN
    Yu, JB
    2002 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS AND WEST SINO EXPOSITION PROCEEDINGS, VOLS 1-4, 2002, : 997 - 1002
  • [40] A study on the pitch extraction detection by linear approximation of sub-band
    Lee, Keun Wang
    Lee, Kwang Hyoung
    Min, So Yeon
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2006, PT 2, 2006, 3981 : 1074 - 1081