Robust speech recognition based on independent vector analysis using harmonic frequency dependency

Cited by: 4

Authors:

Jun, Soram [1]

Kim, Minook [1]

Oh, Myungwoo [1]

Park, Hyung-Min [1]

Affiliations:

[1] Sogang Univ, Dept Elect Engn, Seoul 121742, South Korea

Source:

NEURAL COMPUTING & APPLICATIONS | 2013 / Vol. 22 / Issue 7-8

Funding:

National Research Foundation of Singapore;

Keywords:

Robust speech recognition; Independent vector analysis; Missing feature technique; Blind source separation; BLIND SOURCE SEPARATION; MUSIC;

DOI:

10.1007/s00521-012-1002-6

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes an algorithm that enhances speech by independent vector analysis (IVA) using harmonic frequency dependency for robust speech recognition. While the conventional IVA exploits the full-band uniform dependencies of each source signal, a harmonic clique model is introduced to improve the enhancement performance by modeling strong dependencies among multiples of fundamental frequencies. An IVA-based learning algorithm is derived to consider the non-holonomic constraint and the minimal distortion principle to reduce the unavoidable distortion of IVA, and the minimum power distortionless response beamformer is used as a pre-processing step. In addition, the algorithm compares the log-spectral features of the enhanced speech and observed noisy speech to identify time-frequency segments corrupted by noise and restores those with the cluster-based missing feature reconstruction technique. Experimental results demonstrate that the proposed method enhances recognition performance significantly in noisy environments, especially with competing interference.
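As a rough illustration of the noise-detection step mentioned in the abstract, the sketch below flags time-frequency cells where the observed noisy log-spectral energy exceeds the enhanced one by more than a threshold. The function name `reliability_mask`, the 3 dB threshold, and the (frames, bands) array layout are assumptions made for illustration; they are not taken from the paper, and the paper's actual decision rule may differ.

```python
import numpy as np

def reliability_mask(log_noisy, log_enhanced, threshold_db=3.0):
    """Return a boolean mask that is True where a cell is treated as reliable.

    log_noisy, log_enhanced: arrays of shape (frames, bands) holding
    log-spectral energies in dB. threshold_db is a hypothetical tuning value.
    """
    log_noisy = np.asarray(log_noisy, dtype=float)
    log_enhanced = np.asarray(log_enhanced, dtype=float)
    if log_noisy.shape != log_enhanced.shape:
        raise ValueError("feature arrays must have the same shape")
    # A large drop from noisy to enhanced suggests noise dominated the cell.
    return (log_noisy - log_enhanced) <= threshold_db

# Toy features: 2 frames x 2 bands, values in dB (made up for the example).
noisy = np.array([[10.0, 20.0], [15.0, 5.0]])
enhanced = np.array([[9.0, 8.0], [14.5, 4.0]])
mask = reliability_mask(noisy, enhanced)
```

Cells marked False would then be handed to a reconstruction step such as the cluster-based missing feature technique, which is not shown here.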


Pages: 1321-1327

Page count: 7
