Robust Voice Activity Detection Based on Adaptive Sub-band Energy Sequence Analysis and Harmonic Detection

Authors
Guo, Yanmeng [1]
Fu, Qiang [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIT Speech Lab, Beijing 100080, Peoples R China
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
voice activity detection; harmonic structure; noise robustness; automatic speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice activity detection (VAD) in real-world noise is a challenging task. This paper proposes a two-step method to address it. First, segments containing non-stationary components, including speech and dynamic noise, are located using sub-band energy sequence analysis (SESA). Second, voice is detected within the selected segments using a proposed method based on its harmonic structure. Speech segments can thus be accurately detected by this rule-based framework. The algorithm is evaluated on several databases in terms of speech/non-speech discrimination and in terms of word accuracy rate when used as the front end of an automatic speech recognition (ASR) system. It performs more reliably than the commonly used standard methods.
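The abstract does not spell out the SESA rules or the harmonic test, so the following Python sketch only illustrates the general two-step idea and should not be read as the authors' algorithm. Every name and parameter here is an assumption made for illustration: the functions `subband_energies`, `nonstationary_frames`, `is_harmonic`, and `two_step_vad`, the 8 equal-width bands, the 10th-percentile noise floor, the 6 dB margin, and the 0.5 autocorrelation threshold. Harmonicity is approximated with a normalized autocorrelation peak, which is a swapped-in technique rather than the method proposed in the paper.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def subband_energies(frames, n_bands=8):
    """Per-frame energy in n_bands equal-width frequency bands (assumed layout)."""
    win = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * win, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=1)

def nonstationary_frames(energies, margin_db=6.0):
    """Step 1 (illustrative): flag frames where any band rises well above
    a per-band noise floor, estimated here as the 10th percentile over time.
    This assumes at least some of the recording is background only."""
    log_e = 10.0 * np.log10(energies + 1e-12)
    floor = np.percentile(log_e, 10, axis=0)
    return np.any(log_e - floor > margin_db, axis=1)

def is_harmonic(frame, sr, f0_min=70.0, f0_max=400.0, threshold=0.5):
    """Step 2 (illustrative): look for a strong autocorrelation peak at a lag
    that corresponds to a plausible pitch period."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0.0:
        return False
    ac = ac / ac[0]
    lo = int(sr / f0_max)
    hi = min(int(sr / f0_min), len(frame) - 1)
    return bool(ac[lo:hi + 1].max() > threshold)

def two_step_vad(x, sr, frame_len=256, hop=128, n_bands=8):
    """Return one boolean speech decision per frame."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    candidates = nonstationary_frames(subband_energies(frames, n_bands))
    decisions = np.zeros(len(frames), dtype=bool)
    for i in np.flatnonzero(candidates):
        # Only candidate frames from step 1 are tested for harmonic structure.
        decisions[i] = is_harmonic(frames[i], sr)
    return decisions
```

With this split, a loud but noise-like burst can pass the first step and still be rejected by the second, which mirrors the abstract's distinction between dynamic noise and speech. Real recordings would need tuned thresholds and some smoothing of the frame decisions.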
Pages: 1637-1640
Page count: 4