High resolution speech feature parametrization for monophone-based stressed speech recognition

Cited by: 43
Authors
Sarikaya, R [1]
Hansen, JHL [1]
Affiliations
[1] Univ Colorado, Ctr Spoken Language Res, Robust Speech Proc Lab, Boulder, CO 80309 USA
Keywords
feature extraction; speech recognition; speech under stress; wavelet analysis
DOI
10.1109/97.847363
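Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract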
This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are termed wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to that of traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. An overall monophone recognition improvement of 20.4% and 17.2% is achieved for loud and angry stressed speech, respectively, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.
Pages: 182-185
Number of pages: 4