A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

Cited by: 0
Authors
Wang, Syu-Siang [1]
Hung, Jeih-Weih [2]
Tsao, Yu [1]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan
[2] Natl Chi Nan Univ, Dept Elect Engn, Nantou, Taiwan
Source
2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING | 2012
Keywords
discrete wavelet transform; CMS; CMVN; RASTA; noise robust; speech recognition; SPEECH; NOISE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a cepstral subband normalization (CSN) approach for robust speech recognition. CSN first applies the discrete wavelet transform (DWT) to decompose the original cepstral feature sequence into low-frequency band (LFB) and high-frequency band (HFB) parts. It then normalizes the LFB components and zeros out the HFB components. Finally, an inverse DWT is applied to the LFB and HFB components to form the normalized cepstral features. When Haar functions are used as the DWT bases, CSN can be computed efficiently, with a 50% reduction in the number of feature components. In addition, our experimental results on the Aurora-2 task show that CSN outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and histogram equalization (HEQ). We also integrate CSN with the advanced front-end (AFE) for feature extraction. Experimental results indicate that the integrated AFE+CSN achieves notable improvements over the original AFE. Its simple computation, compact form, and effective noise robustness make CSN suitable for mobile applications.
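The three steps described in the abstract (DWT split, LFB normalization with HFB zeroing, inverse DWT) can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions because the abstract does not specify them: a single-level Haar DWT along the time axis, per-dimension mean and variance normalization of the LFB, and repetition of the last frame to pad odd-length utterances. The function names `haar_dwt`, `haar_idwt`, and `csn` are hypothetical.

```python
import numpy as np


def haar_dwt(x):
    """Single-level Haar DWT along the time axis. x has shape (T, D), T even."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-frequency band (LFB)
    high = (even - odd) / np.sqrt(2.0)  # high-frequency band (HFB)
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleave the reconstructed even and odd frames."""
    x = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=float)
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x


def csn(cepstra, eps=1e-8):
    """Sketch of cepstral subband normalization on a (T, D) feature matrix.

    Assumption: the LFB is mean and variance normalized per cepstral
    dimension; the abstract only says the LFB is "normalized".
    """
    x = np.asarray(cepstra, dtype=float)
    n_frames = x.shape[0]
    if n_frames % 2:
        # Assumption: pad odd-length input by repeating the last frame.
        x = np.vstack([x, x[-1:]])
    low, _ = haar_dwt(x)
    low = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)
    # Zero out the HFB, then reconstruct and drop any padding frame.
    out = haar_idwt(low, np.zeros_like(low))
    return out[:n_frames]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(10, 13))  # 10 frames of 13 synthetic cepstra
    normalized = csn(feats)
    print(normalized.shape)
```

In this sketch, zeroing the HFB makes each pair of consecutive output frames identical, so only every second frame needs to be kept. That is one way to read the 50% reduction in feature components that the abstract reports for the Haar basis.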
Pages: 141-145
Page count: 5