A Vector Space Approach to Environment Modeling for Robust Speech Recognition

被引:0
|
作者
Tsao, Yu [1 ]
Lee, Chin-Hui [1 ]
机构
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
来源
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006年
关键词
acoustic modeling; environment adaptation;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We propose a vector space approach to characterizing environments for robust speech recognition. We represent a given environment by a super-vector formed by concatenating all the mean vectors of the Gaussian mixture components of the state observation densities of all hidden Markov models trained in the particular environment. New environment super-vectors can now be obtained either by an interpolation method with a collection of super-vectors trained from many real or simulated environments or by a transformation performed on an anchor super-vector for a specific environment, such as a clean condition. At a 5dB signal-to-noise (SNR) level, both interpolation- and transformation-based approaches achieve a significant error rate reduction of close to 47% from a baseline system with cepstral mean subtraction (CMS) with only two adaptation utterances. When incorporating N-best information to perform unsupervised adaptation at 5dB SNR with the same two utterances, we achieve a relative error reduction of about 40%, close to that achieved in the supervised mode.
引用
收藏
页码:785 / 788
页数:4
相关论文
共 50 条
  • [21] ROBUST FEATURE SPACE ADAPTATION FOR TELEPHONY SPEECH RECOGNITION
    Lei, Xin
    Hamaker, Jon
    He, Xiaodong
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 773 - +
  • [22] COMBINING EIGENVOICE SPEAKER MODELING AND VTS-BASED ENVIRONMENT COMPENSATION FOR ROBUST SPEECH RECOGNITION
    Ou, Zhijian
    Deng, Kan
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4673 - 4676
  • [23] Robust recognition of cellular telephone speech by adaptive vector quantization
    Sonmez, MK
    Rajasekaran, R
    Baras, JS
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 503 - 506
  • [24] Modeling auditory perception to improve robust speech recognition
    Strope, B
    Alwan, A
    THIRTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2, 1998, : 1056 - 1060
  • [25] An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition
    Wu, Bo
    Li, Kehuang
    Ge, Fengpei
    Huang, Zhen
    Yang, Minglei
    Siniscalchi, Sabato Marco
    Lee, Chin-Hui
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2017, 11 (08) : 1289 - 1300
  • [26] A COLLABORATIVE SPEECH ENHANCEMENT APPROACH FOR SPEECH RECOGNITION IN MOTORCYCLE ENVIRONMENT
    Mporas, Iosif
    Kocsis, Otilia
    Ganchev, Todor
    Fakotakis, Nikos
    2009 16TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, VOLS 1 AND 2, 2009, : 1254 - 1259
  • [27] NOISE ADAPTIVE TRAINING USING A VECTOR TAYLOR SERIES APPROACH FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
    Kalinli, Ozlem
    Seltzer, Michael L.
    Acero, Alex
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3825 - 3828
  • [28] A maximum likelihood approach to unsupervised online adaptation of stochastic vector mapping function for robust speech recognition
    Zhu, Donglai
    Huo, Qiang
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 773 - +
  • [29] ROBUST SPEECH RECOGNITION THROUGH SELECTION OF SPEAKER AND ENVIRONMENT TRANSFORMS
    Bilgi, Raghavendra
    Joshi, Vikas
    Umesh, S.
    Garcia, L.
    Benitez, C.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4333 - 4336
  • [30] IMPULSE RESPONSE ESTIMATION FOR ROBUST SPEECH RECOGNITION IN A REVERBERANT ENVIRONMENT
    Ravanelli, Mirco
    Sosi, Alessandro
    Svaizer, Piergiorgio
    Omologo, Maurizio
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 1668 - 1672