SUBBAND HYBRID FEATURE FOR MULTI-STREAM SPEECH RECOGNITION

Cited by: 0
Author
Li, Feipeng [1]
Affiliation
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Source
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014
Keywords
noise robustness; multi-stream speech recognition; subband feature; PERCEPTION; PHONEME;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
A subband hybrid (SBH) feature is developed for multi-stream (MS) speech recognition. The fullband speech signal is decomposed into multiple subbands, each covering about 3 Bark along the frequency axis. The speech signal is analyzed by a high-resolution filterbank of 4 filters/Bark and a low-resolution filterbank of 2 filters/Bark to facilitate the representation of both short-term spectral modulation and long-term temporal modulation within a frequency subband. Experiments on the TIMIT corpus for English and the RATS corpus for Levantine Arabic show that the SBH feature significantly increases the amount of information extracted from individual subbands. The MS system with a performance monitor achieves a substantial gain in performance over the single-stream baseline.
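The subband layout described in the abstract can be illustrated with a small sketch. It is not the author's implementation: the Hz-to-Bark conversion (Traunmüller's approximation), the 100-4000 Hz analysis range, and all function names are assumptions chosen for illustration. Only the subband width of about 3 Bark and the filter densities of 4 and 2 filters/Bark come from the abstract.

```python
import math


def hz_to_bark(f_hz):
    # Traunmueller's approximation (an assumption; the abstract names no formula).
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark, valid for z < 26.28.
    return 1960.0 * (z + 0.53) / (26.28 - z)


def filter_centers(z_lo, z_hi, filters_per_bark):
    # Evenly spaced filter centers (in Bark) inside [z_lo, z_hi].
    n = int(round((z_hi - z_lo) * filters_per_bark))
    step = 1.0 / filters_per_bark
    return [z_lo + (i + 0.5) * step for i in range(n)]


def subband_layout(f_min_hz=100.0, f_max_hz=4000.0, width_bark=3.0,
                   hi_res=4, lo_res=2):
    # f_min_hz and f_max_hz are hypothetical defaults, not values from the paper.
    z_min, z_max = hz_to_bark(f_min_hz), hz_to_bark(f_max_hz)
    n_subbands = math.ceil((z_max - z_min) / width_bark)
    layout = []
    for k in range(n_subbands):
        z_lo = z_min + k * width_bark
        z_hi = min(z_lo + width_bark, z_max)  # the last subband may be narrower
        layout.append({
            "edges_hz": (bark_to_hz(z_lo), bark_to_hz(z_hi)),
            "hi_res_centers_hz": [bark_to_hz(z)
                                  for z in filter_centers(z_lo, z_hi, hi_res)],
            "lo_res_centers_hz": [bark_to_hz(z)
                                  for z in filter_centers(z_lo, z_hi, lo_res)],
        })
    return layout


if __name__ == "__main__":
    for k, band in enumerate(subband_layout()):
        lo_edge, hi_edge = band["edges_hz"]
        print(f"subband {k}: {lo_edge:7.1f}-{hi_edge:7.1f} Hz, "
              f"{len(band['hi_res_centers_hz'])} high-res filters, "
              f"{len(band['lo_res_centers_hz'])} low-res filters")
```

Under these assumptions a full 3-Bark subband holds 12 high-resolution and 6 low-resolution filter centers. The sketch covers only the frequency partitioning; how the two filterbank outputs are turned into spectral and temporal modulation features is not specified in the abstract and is left out.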
Pages: 5