Robust speech recognition based on independent vector analysis using harmonic frequency dependency

Cited by: 4
Authors
Jun, Soram [1]
Kim, Minook [1]
Oh, Myungwoo [1]
Park, Hyung-Min [1]
Affiliations
[1] Sogang Univ, Dept Elect Engn, Seoul 121742, South Korea
Funding
National Research Foundation, Singapore
Keywords
Robust speech recognition; Independent vector analysis; Missing feature technique; Blind source separation; MUSIC
DOI
10.1007/s00521-012-1002-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes an algorithm that enhances speech by independent vector analysis (IVA) using harmonic frequency dependency for robust speech recognition. While the conventional IVA exploits the full-band uniform dependencies of each source signal, a harmonic clique model is introduced to improve the enhancement performance by modeling strong dependencies among multiples of fundamental frequencies. An IVA-based learning algorithm is derived to consider the non-holonomic constraint and the minimal distortion principle to reduce the unavoidable distortion of IVA, and the minimum power distortionless response beamformer is used as a pre-processing step. In addition, the algorithm compares the log-spectral features of the enhanced speech and observed noisy speech to identify time-frequency segments corrupted by noise and restores those with the cluster-based missing feature reconstruction technique. Experimental results demonstrate that the proposed method enhances recognition performance significantly in noisy environments, especially with competing interference.
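The comparison step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the decision rule, so everything here is an assumption: the function name `unreliable_mask`, the fixed `threshold`, and the rule itself (a time-frequency cell counts as noise-corrupted when the noisy log-spectral value exceeds the enhanced one by more than the threshold) are hypothetical and are not taken from the paper.

```python
def unreliable_mask(noisy_log, enhanced_log, threshold=1.0):
    """Flag time-frequency cells that look noise-corrupted.

    noisy_log, enhanced_log: lists of frames, each frame a list of
    log-spectral values (same shape for both inputs).
    threshold: hypothetical margin; not a value from the paper.
    Returns a list of frames of booleans (True = unreliable cell).
    """
    if len(noisy_log) != len(enhanced_log):
        raise ValueError("inputs must have the same number of frames")
    mask = []
    for noisy_frame, enhanced_frame in zip(noisy_log, enhanced_log):
        if len(noisy_frame) != len(enhanced_frame):
            raise ValueError("frames must have the same number of bins")
        # Assumed rule: a large drop after enhancement suggests the
        # observed cell was dominated by noise.
        mask.append([n - e > threshold
                     for n, e in zip(noisy_frame, enhanced_frame)])
    return mask


# Toy data: 2 frames x 3 frequency bins of log-spectral features.
noisy = [[2.0, 5.0, 1.0], [3.0, 3.2, 4.5]]
enhanced = [[1.8, 2.5, 0.9], [2.9, 3.0, 2.0]]
mask = unreliable_mask(noisy, enhanced, threshold=1.0)
```

Cells marked True would then be handed to a missing-feature reconstruction stage (cluster-based in the paper); that stage is not sketched here.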
Pages: 1321-1327
Number of pages: 7
Related papers
50 in total
  • [41] ROBUST SPEECH RECOGNITION USING MULTIVARIATE COPULA MODELS
    Bayestehtashk, Alireza
    Shafran, Izhak
    Babaeian, Amir
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5890 - 5894
  • [42] REAL-TIME INDEPENDENT VECTOR ANALYSIS WITH STUDENT'S T SOURCE PRIOR FOR CONVOLUTIVE SPEECH MIXTURES
    Harris, Jack
    Rivet, Bertrand
    Naqvi, Syed Mohsen
    Chambers, Jonathon A.
    Jutten, Christian
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 1856 - 1860
  • [43] Binary and ratio time-frequency masks for robust speech recognition
    Srinivasan, Soundararajan
    Roman, Nicoleta
    Wang, DeLiang
    SPEECH COMMUNICATION, 2006, 48 (11) : 1486 - 1501
  • [44] Subspace independent component analysis using vector kurtosis
    Sharma, Alok
    Paliwal, Kuldip K.
    PATTERN RECOGNITION, 2006, 39 (11) : 2227 - 2232
  • [45] AN ALTERNATIVE PROOF FOR THE IDENTIFIABILITY OF INDEPENDENT VECTOR ANALYSIS USING SECOND ORDER STATISTICS
    Lahat, Dana
    Jutten, Christian
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 4363 - 4367
  • [46] AN AUDITORY-BASED FEATURE FOR ROBUST SPEECH RECOGNITION
    Shao, Yang
    Jin, Zhaozhang
    Wang, DeLiang
    Srinivasan, Soundararajan
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4625 - +
  • [47] A SPARSITY BASED PREPROCESSING FOR NOISE ROBUST SPEECH RECOGNITION
    Koniaris, Christos
    Chatterjee, Saikat
    2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 513 - 518
  • [48] Time-Frequency Masking For Large Scale Robust Speech Recognition
    Wang, Yuxuan
    Misra, Ananya
    Chin, Kean K.
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2469 - 2473
  • [49] Independent vector analysis with a generalized multivariate Gaussian source prior for frequency domain blind source separation
    Liang, Yanfeng
    Harris, Jack
    Naqvi, Syed Mohsen
    Chen, Gaojie
    Chambers, Jonathon A.
    SIGNAL PROCESSING, 2014, 105 : 175 - 184
  • [50] A computational auditory scene analysis system for speech segregation and robust speech recognition
    Shao, Yang
    Srinivasan, Soundararajan
    Jin, Zhaozhang
    Wang, DeLiang
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01) : 77 - 93