SUBBAND HYBRID FEATURE FOR MULTI-STREAM SPEECH RECOGNITION

Cited by: 0

Author:

Li, Feipeng [1]

Affiliation:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014

Keywords:

noise robustness; multi-stream speech recognition; subband feature; PERCEPTION; PHONEME;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

A subband hybrid (SBH) feature is developed for multi-stream (MS) speech recognition. The fullband speech signal is decomposed into multiple subbands, each covering about 3 Bark along the frequency axis. The speech signal is analyzed by a high-resolution filterbank of 4 filters/Bark and a low-resolution filterbank of 2 filters/Bark to facilitate the representation of both short-term spectral modulation and long-term temporal modulation within a frequency subband. Experiments on the TIMIT corpus for English and the RATS corpus for Levantine Arabic show that the SBH feature significantly increases the amount of information extracted from individual subbands. The MS system with a performance monitor achieves a substantial gain in performance over the single-stream baseline.
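
The subband layout described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the Bark formula (Traunmueller's approximation), the 8 kHz upper band limit, the evenly spaced filter centers, and all function names below are assumptions made for illustration only. The sketch splits the Bark axis into subbands of about 3 Bark and places filter centers at 4 filters/Bark and 2 filters/Bark inside each one.

```python
def hz_to_bark(f_hz):
    """Traunmueller's approximation of the Bark scale (assumed, not from the paper)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def subband_edges(f_max_hz, width_bark=3.0):
    """Split [0 Bark, bark(f_max_hz)] into contiguous subbands about width_bark wide."""
    z_max = hz_to_bark(f_max_hz)
    n_bands = max(1, round(z_max / width_bark))
    step = z_max / n_bands
    return [(i * step, (i + 1) * step) for i in range(n_bands)]


def filter_centers(z_lo, z_hi, filters_per_bark):
    """Evenly spaced filter center positions (in Bark) inside one subband."""
    n_filters = max(1, round((z_hi - z_lo) * filters_per_bark))
    step = (z_hi - z_lo) / n_filters
    return [z_lo + (k + 0.5) * step for k in range(n_filters)]


if __name__ == "__main__":
    # 8000 Hz is an assumed upper limit (Nyquist frequency for 16 kHz audio).
    for z_lo, z_hi in subband_edges(8000.0):
        high_res = filter_centers(z_lo, z_hi, 4)  # high-resolution filterbank
        low_res = filter_centers(z_lo, z_hi, 2)   # low-resolution filterbank
        print(f"{bark_to_hz(z_lo):7.1f}-{bark_to_hz(z_hi):7.1f} Hz: "
              f"{len(high_res)} high-res, {len(low_res)} low-res filters")
```

Under these assumptions, each subband of about 3 Bark holds 12 high-resolution and 6 low-resolution filter centers. The actual filter shapes and the way the two resolutions are combined into the SBH feature are not specified in the abstract and are left out here.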

Pages: 5