Voice Activity Detection in Presence of Transient Noise Using Spectral Clustering

Cited by: 38
Authors
Mousazadeh, Saman [1]
Cohen, Israel [1]
Affiliation
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013 / Vol. 21 / Issue 06
Funding
Israel Science Foundation;
Keywords
Gaussian mixture model; spectral clustering; transient noise; voice activity detection; ACOUSTIC EVENT DETECTION;
DOI
10.1109/TASL.2013.2248717
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Voice activity detection has attracted significant research efforts in the last two decades. Despite much progress in designing voice activity detectors, voice activity detection (VAD) in presence of transient noise is a challenging problem. In this paper, we develop a novel VAD algorithm based on spectral clustering methods. We propose a VAD technique which is a supervised learning algorithm. This algorithm divides the input signal into two separate clusters (i.e., speech presence and speech absence frames). We use labeled data in order to adjust the parameters of the kernel used in spectral clustering methods for computing the similarity matrix. The parameters obtained in the training stage together with the eigenvectors of the normalized Laplacian of the similarity matrix and Gaussian mixture model (GMM) are utilized to compute the likelihood ratio needed for voice activity detection. Simulation results demonstrate the advantage of the proposed method compared to conventional statistical model-based VAD algorithms in presence of transient noise.
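As a rough illustration of the two-cluster idea in the abstract, the following is a minimal sketch of generic spectral clustering on per-frame feature vectors. It is not the authors' algorithm: the kernel width `sigma` is fixed by hand rather than learned from labeled data, the split uses the sign of the second eigenvector rather than a GMM likelihood ratio, and all names (`gaussian_similarity`, `second_eigenvector`, `cluster_frames`) are hypothetical.

```python
import math


def gaussian_similarity(frames, sigma):
    """Similarity matrix W[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(frames)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(frames[i], frames[j]))
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def second_eigenvector(W, iters=200):
    """Second leading eigenvector of S = D^{-1/2} W D^{-1/2}.

    This is also the eigenvector for the second smallest eigenvalue of the
    normalized Laplacian I - S. It is found by power iteration with the
    known leading eigenvector D^{1/2} 1 projected out at every step.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    total = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / total for d in deg]  # unit-norm leading eigenvector
    # Deterministic, mixed-sign start vector (an assumption of this sketch).
    u = [float(i) - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(u, v1))
        u = [a - dot * b for a, b in zip(u, v1)]  # remove the v1 component
        u = [sum(S[i][j] * u[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in u))
        u = [a / norm for a in u]
    return u


def cluster_frames(frames, sigma):
    """Assign each frame to one of two clusters by the sign of the eigenvector."""
    u = second_eigenvector(gaussian_similarity(frames, sigma))
    return [1 if value > 0 else 0 for value in u]


# Toy 1-D "features": three frames near 0 and three frames near 5.
toy_frames = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
labels = cluster_frames(toy_frames, sigma=1.0)
```

Which cluster index corresponds to speech presence is arbitrary in this sketch; per the abstract, the proposed method resolves this with labeled training data and a likelihood ratio.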
Pages: 1261 - 1271
Page count: 11