A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

Cited by: 0

Authors:

Wang, Syu-Siang [1]

Hung, Jeih-Weih [2]

Tsao, Yu [1]

Affiliations:

[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan

[2] Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan

Source:

2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | 2012

Keywords:

discrete wavelet transform; CMS; CMVN; RASTA; noise robust; speech recognition; SPEECH; NOISE

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we propose a cepstral subband normalization (CSN) approach for robust speech recognition. The CSN approach first applies the discrete wavelet transform (DWT) to decompose the original cepstral feature sequence into low- and high-frequency band (LFB and HFB) parts. Then, CSN normalizes the LFB components and zeros out the HFB components. Finally, an inverse DWT is applied to the LFB and HFB components to form the normalized cepstral features. When the Haar functions are used as the DWT bases, CSN can be computed efficiently, with a 50% reduction in the number of feature components. In addition, our experimental results on the Aurora-2 task show that CSN outperforms the conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and histogram equalization (HEQ). We also integrate CSN with the advanced front-end (AFE) for feature extraction. Experimental results indicate that the integrated AFE+CSN achieves notable improvements over the original AFE. Its simple calculation, compact form, and effective noise robustness make CSN well suited to mobile applications.
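The three steps named in the abstract (DWT decomposition, LFB normalization with HFB zeroing, inverse DWT) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the LFB components are normalized, so mean and variance normalization is assumed here, and the function names (`haar_dwt`, `haar_idwt`, `csn`), the one-level decomposition, and the padding of odd-length utterances are all hypothetical choices made for the example.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT along the time axis. x has shape (T, D), T even."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-frequency band (LFB)
    high = (even - odd) / np.sqrt(2.0)  # high-frequency band (HFB)
    return low, high

def haar_idwt(low, high):
    """Inverse of haar_dwt: interleave the reconstructed even and odd frames."""
    out = np.empty((2 * low.shape[0], low.shape[1]))
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return out

def csn(features, eps=1e-8):
    """Sketch of CSN on a (frames, cepstral_dims) array.

    Assumptions (not stated in the abstract): the LFB is mean and variance
    normalized per dimension, and an odd-length utterance is padded by
    repeating its last frame.
    """
    x = np.asarray(features, dtype=float)
    n_frames = x.shape[0]
    if n_frames % 2:
        x = np.vstack([x, x[-1:]])
    low, high = haar_dwt(x)
    low = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)  # normalize LFB
    high = np.zeros_like(high)                                # zero out HFB
    return haar_idwt(low, high)[:n_frames]
```

With the HFB set to zero, the inverse Haar transform produces identical values for each pair of adjacent frames, so only the normalized LFB sequence (half as many frames) needs to be kept. This is consistent with the 50% reduction in feature components mentioned in the abstract.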


Pages: 141-145

Page count: 5
