High resolution speech feature parametrization for monophone-based stressed speech recognition

Cited by: 43
Authors
Sarikaya, R [1]
Hansen, JHL [1]
Affiliations
[1] Univ Colorado, Ctr Spoken Language Res, Robust Speech Proc Lab, Boulder, CO 80309 USA
Keywords
feature extraction; speech recognition; speech under stress; wavelet analysis
DOI
10.1109/97.847363
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are entitled wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. An overall monophone recognition improvement of 20.4% and 17.2% is achieved for loud and angry stressed speech, respectively, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.
Pages: 182-185
Page count: 4