High resolution speech feature parametrization for monophone-based stressed speech recognition

Cited by: 43

Authors:

Sarikaya, R [1]

Hansen, JHL [1]

Affiliations:

[1] Univ Colorado, Ctr Spoken Language Res, Robust Speech Proc Lab, Boulder, CO 80309 USA

Source:

IEEE SIGNAL PROCESSING LETTERS | 2000 / Vol. 7 / Issue 07

Keywords:

feature extraction; speech recognition; speech under stress; wavelet analysis

DOI:

10.1109/97.847363

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

This letter investigates the impact of stress on monophone speech recognition accuracy and proposes a new set of acoustic parameters based on high resolution wavelet analysis. The two parameter schemes are entitled wavelet packet parameters (WPP) and subband-based cepstral parameters (SBC). The performance of these features is compared to traditional Mel-frequency cepstral coefficients (MFCC) for stressed speech monophone recognition. The stressed speaking styles considered are neutral, angry, loud, and Lombard effect speech from the SUSAS database. An overall monophone recognition improvement of 20.4% and 17.2% is achieved for loud and angry stressed speech, with a corresponding increase in the neutral monophone rate of 9.9% over MFCC parameters.

Pages: 182-185

Page count: 4

50 items in total

[1] Monophone-based connected word Hindi speech recognition improvement
Bhatt S.
Jain A.
Dev A.
Sadhana - Academy Proceedings in Engineering Sciences, 2021, 46 (02)
[2] Efficient data selection for speech recognition based on prior confidence estimation using speech and monophone models
Kobashikawa, Satoshi
Asami, Taichi
Yamaguchi, Yoshikazu
Masataki, Hirokazu
Takahashi, Satoshi
COMPUTER SPEECH AND LANGUAGE, 2014, 28 (06) : 1287 - 1297
[3] Speech feature extraction based on wavelet modulation scale for robust speech recognition
Ma, Xin
Zhou, Weidong
Ju, Fang
Jiang, Qi
NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233 : 499 - 505
[4] Acceleration of feature extraction for FPGA based speech recognition
Arminas, Vytautas
Tamulevicius, Gintautas
Navakauskas, Dalius
Ivanovas, Edgaras
PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2010, 2010, 7745
[5] Optimizing feature extraction for speech recognition
Lee, CH
Hyun, DH
Choi, ES
Go, JW
Lee, CY
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (01) : 80 - 87
[6] A SPEECH RECOGNITION METHOD BASED ON FEATURE DISTRIBUTIONS
LIU, LC
CHIOU, D
WANG, HC
PATTERN RECOGNITION, 1991, 24 (08) : 717 - 722
[7] Feature Extraction Based on Speech Attractors in the Reconstructed Phase Space for Automatic Speech Recognition Systems
Shekofteh, Yasser
Almasganj, Farshad
ETRI JOURNAL, 2013, 35 (01) : 100 - 108
[8] Feature extraction based on auditory representations for robust speech recognition
Kim, DS
Lee, SY
Kil, RM
Zhu, XL
ELECTRONICS LETTERS, 1997, 33 (01) : 15 - 16
[9] A Study on Speech Recognition by a Neural Network Based on English Speech Feature Parameters
Mao, Congmin
Liu, Sujing
JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2024, 28 (03) : 679 - 684
[10] A Subspace Projection Based Approach to Improve the Recognition of Stressed Speech
Priya, Bhanu
Dandapat, S.
2016 IEEE ANNUAL INDIA CONFERENCE (INDICON), 2016
