Nonlinear spectral transformations for robust speech recognition

Cited by: 2
Authors
Ikbal, S [1]
Hermansky, H [1]
Bourlard, H [1]
Affiliations
[1] IDIAP, Martigny, Switzerland
Source
ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING | 2003
DOI
10.1109/ASRU.2003.1318473
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, a nonlinear transformation of autocorrelation coefficients, named Phase AutoCorrelation (PAC) coefficients, has been considered for feature extraction [1]. PAC-based features show improved robustness to additive noise as a result of two operations performed during the computation of PAC, namely energy normalization and inverse cosine transformation. Despite the improved robustness achieved for noisy speech, these two operations lead to some degradation in recognition performance for clean speech. In this paper, we try to alleviate this problem, first by introducing the energy information back into the PAC-based features, and second by studying alternatives to the inverse cosine function. Simply appending the frame energy as an additional coefficient to the PAC features results in a noticeable improvement in performance for clean speech. The study of alternatives to the inverse cosine transformation leads to the conclusion that a linear transformation is best for clean speech, while nonlinear functions help to improve robustness in noise.
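The two operations named in the abstract, energy normalization and the inverse cosine, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the autocorrelation is computed with circular shifts of the frame (so every shifted copy has the same energy and the normalized value is a true cosine), and the function name `pac_coefficients`, the `transform` parameter, and the use of log energy for the appended coefficient are all hypothetical choices made for illustration.

```python
import numpy as np


def pac_coefficients(frame, transform=np.arccos, append_energy=False):
    """Sketch of PAC-style coefficients for one speech frame.

    Assumptions (not taken from the abstract): circular autocorrelation,
    and log energy as the optional appended coefficient.
    """
    x = np.asarray(frame, dtype=float)
    energy = float(np.dot(x, x))
    if energy == 0.0:
        raise ValueError("frame has zero energy; normalization is undefined")

    # Circular autocorrelation: dot product of the frame with each shifted copy.
    r = np.array([np.dot(x, np.roll(x, k)) for k in range(len(x))])

    # Energy normalization: r[k] / energy is the cosine of the angle
    # between the frame and its k-shifted copy. Clip to guard rounding error.
    cos_theta = np.clip(r / energy, -1.0, 1.0)

    # Inverse cosine by default; pass another function to try an alternative.
    coeffs = transform(cos_theta)

    if append_energy:
        # Reintroduce the energy information removed by the normalization.
        coeffs = np.append(coeffs, np.log(energy))
    return coeffs


if __name__ == "__main__":
    frame = np.array([0.5, -1.0, 2.0, 0.25, -0.75])
    print(pac_coefficients(frame))
    print(pac_coefficients(frame, append_energy=True))
    # A linear alternative: keep the normalized autocorrelation unchanged.
    print(pac_coefficients(frame, transform=lambda c: c))
```

Because of the normalization, scaling the frame by a constant leaves the coefficients unchanged, which is why the energy has to be appended separately if it is wanted. Passing an identity function as `transform` corresponds to the linear alternative mentioned in the abstract, under the same assumptions.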
Pages: 393-398
Number of pages: 6