NOISE AND SPEAKER COMPENSATION IN THE LOG FILTER BANK DOMAIN

Cited by: 0
Authors
Joshi, Vikas [1 ]
Bilgi, Raghavendra [1 ]
Umesh, S. [1 ]
Garcia, L. [2 ]
Benitez, C. [2 ]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
[2] Univ Granada, Dept Signal Theory Telemat & Commun, E-18071 Granada, Spain
Keywords
Speaker Normalization; Noise Compensation; VTS; TVTLN; Noise and Speaker Compensation
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose a method to compensate for noise and speaker variability directly in the log filter-bank (FB) domain, so that the resulting MFCC features are robust to both noise and speaker variations. For noise compensation, we use the Vector Taylor Series (VTS) approach in the log FB domain; speaker normalization is also performed in the log FB domain, using linear vocal tract length normalization (VTLN) matrices. For VTLN, the optimal warp factor is selected in the log FB domain using a canonical GMM, which avoids the two-pass approach required with an HMM. Further, this selection can be implemented efficiently using sufficient statistics obtained from the GMM and the FB-VTLN matrices. Warp-factor selection with the GMM can also be done in the cepstral domain by applying DCT matrices, without the usual approximations associated with conventional linear VTLN. The elegance of the proposed approach is that, given the speech data, we directly obtain MFCC features that are robust to noise and speaker variations. The proposed approach shows a significant relative improvement of 31% over the baseline on the Aurora-4 task.
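The two log FB domain steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard additive-noise mismatch function y = x + log(1 + exp(n - x)) with no channel term, diagonal covariances, and a warp-factor search that adds a log-determinant Jacobian term to the GMM log-likelihood. The function names and the `warp_matrices` dictionary (warp factor mapped to a square FB-domain matrix) are hypothetical.

```python
import numpy as np

def noisy_log_fb(x, n):
    """Assumed mismatch function: noisy log FB energies from clean x and noise n."""
    return x + np.log1p(np.exp(n - x))

def vts_compensate(mu_x, var_x, mu_n, var_n):
    """First-order VTS expansion of the mismatch function around (mu_x, mu_n)."""
    g = 1.0 / (1.0 + np.exp(mu_n - mu_x))   # dy/dx per channel; dy/dn = 1 - g
    mu_y = noisy_log_fb(mu_x, mu_n)
    var_y = g ** 2 * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y

def gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal-covariance GMM."""
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)                    # (T, K)
    peak = log_comp.max(axis=1, keepdims=True)               # log-sum-exp trick
    per_frame = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(np.sum(per_frame))

def select_warp(frames, warp_matrices, weights, means, variances):
    """Pick the warp factor whose matrix maximizes the GMM score in one pass."""
    best_alpha, best_score = None, -np.inf
    for alpha, A in warp_matrices.items():
        warped = frames @ A.T                                # apply the linear warp
        _, logdet = np.linalg.slogdet(A)                     # Jacobian term (assumption)
        score = gmm_loglik(warped, weights, means, variances) + len(frames) * logdet
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha
```

Because the search only scores each candidate matrix against a fixed GMM, no transcription or first-pass decoding is required, which is the property the abstract contrasts with HMM-based warp selection.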
Pages: 4709-4712
Number of pages: 4