NOISE AND SPEAKER COMPENSATION IN THE LOG FILTER BANK DOMAIN

Cited by: 0
Authors
Joshi, Vikas [1]
Bilgi, Raghavendra [1]
Umesh, S. [1]
Garcia, L. [2]
Benitez, C. [2]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
[2] Univ Granada, Dept Signal Theory Telemat & Commun, E-18071 Granada, Spain
Keywords
Speaker Normalization; Noise Compensation; VTS; TVTLN; Noise and Speaker Compensation
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we propose a method to compensate for noise and speaker variability directly in the log filter-bank (FB) domain, so that the resulting MFCC features are robust to both noise and speaker variations. For noise compensation, we use the Vector Taylor Series (VTS) approach in the log FB domain; speaker normalization is also performed in the log FB domain, using linear vocal tract length normalization (VTLN) matrices. For VTLN, the optimal warp factor is selected in the log FB domain using a canonical GMM, which avoids the two-pass approach required with an HMM. Further, this selection can be implemented efficiently using sufficient statistics obtained from the GMM and the FB-VTLN matrices. Warp-factor selection using the GMM can also be done in the cepstral domain by applying DCT matrices, without the usual approximations associated with conventional linear VTLN. The elegance of the proposed approach is that, given the speech data, we directly obtain MFCC features that are robust to noise and speaker variations. The proposed approach shows a significant relative improvement of 31% over the baseline on the Aurora-4 task.
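The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard additive-noise mismatch function in the log FB domain, y = log(exp(x) + exp(n)), which the abstract does not state explicitly, and it selects a warp factor by a plain grid search over candidate transforms scored against a diagonal-covariance GMM. All function names, the toy shift-based matrices, and the GMM parameters are hypothetical; real FB-VTLN matrices would come from the frequency-warping function. Each candidate is rescored directly here, rather than through the sufficient statistics the abstract mentions.

```python
import numpy as np


def vts_mismatch(clean_logfb, noise_logfb):
    """Noisy log-FB energies under additive noise: y = log(exp(x) + exp(n)).

    Assumed standard form; channel and phase terms are ignored.
    """
    return np.logaddexp(clean_logfb, noise_logfb)


def vts_jacobian_diag(clean_logfb, noise_logfb):
    """Diagonal of dy/dx, the first-order term used in a VTS linearization."""
    return 1.0 / (1.0 + np.exp(noise_logfb - clean_logfb))


def diag_gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal GMM with K components."""
    diff = frames[:, None, :] - means[None, :, :]             # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)                      # (T, K)
    peak = log_comp.max(axis=1, keepdims=True)                 # log-sum-exp trick
    per_frame = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(np.sum(per_frame))


def select_warp_factor(frames, transforms, weights, means, variances):
    """Return the warp factor whose linear transform maximizes the GMM score.

    The score adds T * log|det A| so that transforms are compared fairly.
    """
    scores = {}
    for alpha, matrix in transforms.items():
        warped = frames @ matrix.T
        _, logdet = np.linalg.slogdet(matrix)
        loglik = diag_gmm_loglik(warped, weights, means, variances)
        scores[alpha] = loglik + frames.shape[0] * logdet
    return max(scores, key=scores.get)


# Toy setup: hypothetical 3-channel "warping" matrices built from a shift matrix.
shift = np.diag(np.ones(2), k=1)
toy_transforms = {a: np.eye(3) + (a - 1.0) * shift for a in (0.9, 1.0, 1.1)}
toy_weights = np.array([1.0])
toy_means = np.array([[1.0, 2.0, 3.0]])
toy_variances = np.ones((1, 3))

# Frames that land exactly on the GMM mean after the alpha = 1.1 transform.
toy_frames = np.tile(np.linalg.solve(toy_transforms[1.1], toy_means[0]), (5, 1))
print(select_warp_factor(toy_frames, toy_transforms, toy_weights,
                         toy_means, toy_variances))
```

Because the GMM is scored frame by frame without a transcription, a single pass over the data suffices in this sketch, which is the property the abstract contrasts with the two-pass HMM approach.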
Pages: 4709-4712
Number of pages: 4