Hard-Mask Missing Feature Theory for Robust Speaker Recognition

Cited by: 4
Authors
Lim, Shin-Cheol [1]
Jang, Sei-Jin [2]
Lee, Soek-Pil [2]
Kim, Moo Young [1]
Affiliations
[1] Sejong Univ, Human Comp Interact Lab, Dept Informat & Commun Engn, Seoul, South Korea
[2] Korea Elect Technol Inst, Digital Media Res Ctr, Seoul, South Korea
Keywords
Speaker recognition; missing feature theory; MFT; AMFT; noise
DOI
10.1109/TCE.2011.6018880
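Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract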
Compared with conventional full-band speaker recognition systems, Advanced Missing Feature Theory (AMFT) achieves a much lower error rate, but at the cost of increased computational complexity. We propose a weighting function for the score calculation algorithm in AMFT. The weighting function is estimated by counting the reliable spectral components. A modified mask is also proposed to reduce the number of reliable components based on the estimated weighting function. In the proposed Hard-mask MFT-8 (HMFT-8), only 8 of the 10 spectral components in a feature vector are selected. Compared with the full-band system and AMFT, the proposed HMFT-8 lowers the identification error rate by 16.95% and 2.67%, respectively. In terms of computational complexity, AMFT and HMFT-8 require 307 and 41 arithmetic and conditional operations per frame, respectively.
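The hard-mask idea in the abstract (keep 8 of 10 spectral components and score only those) can be illustrated with a minimal sketch. This is not the authors' algorithm: the per-component reliability values, the diagonal-Gaussian score, and the normalisation by the reliable-component count are all assumptions made for illustration, and the function names `hard_mask`, `masked_log_likelihood`, and `weighted_score` are hypothetical.

```python
import math


def hard_mask(reliability, keep=8):
    """Return a 0/1 mask that keeps the `keep` most reliable components.

    `reliability` is an assumed per-component score (e.g. a local SNR
    estimate); the abstract does not say how reliability is measured.
    """
    order = sorted(range(len(reliability)),
                   key=lambda i: reliability[i], reverse=True)
    kept = set(order[:keep])
    return [1 if i in kept else 0 for i in range(len(reliability))]


def masked_log_likelihood(x, mean, var, mask):
    """Diagonal-Gaussian log-likelihood summed over kept components only."""
    total = 0.0
    for xi, mi, vi, keep in zip(x, mean, var, mask):
        if keep:
            total += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


def weighted_score(x, mean, var, mask):
    """Normalise the masked score by the number of reliable components.

    Dividing by the count is an assumed form of the weighting function,
    chosen only because the abstract ties the weight to that count.
    """
    n_reliable = sum(mask)
    if n_reliable == 0:
        return float("-inf")
    return masked_log_likelihood(x, mean, var, mask) / n_reliable


# Illustrative 10-component frame; all values are made up.
reliability = [0.9, 0.1, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.95]
mask = hard_mask(reliability, keep=8)
frame = [0.0] * 10
score = weighted_score(frame, mean=[0.0] * 10, var=[1.0] * 10, mask=mask)
```

With a fixed number of kept components, the per-frame score loop runs over 8 terms instead of 10, which is one plausible source of the lower operation count the abstract reports; the actual savings in the paper may come from elsewhere.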
Pages: 1245-1250
Number of pages: 6
Related papers
50 in total
[21]   Spectral-temporal receptive fields and MFCC balanced feature extraction for robust speaker recognition [J].
Wang, Jia-Ching ;
Wang, Chien-Yao ;
Chin, Yu-Hao ;
Liu, Yu-Ting ;
Chen, En-Ting ;
Chang, Pao-Chi .
MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (03) :4055-4068
[22]   Noise-robust speaker recognition using subband likelihoods and reliable-feature selection [J].
Kim, Sungtak ;
Ji, Mikyong ;
Kim, Hoirin .
ETRI JOURNAL, 2008, 30 (01) :89-100
[23]   X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION [J].
Snyder, David ;
Garcia-Romero, Daniel ;
Sell, Gregory ;
Povey, Daniel ;
Khudanpur, Sanjeev .
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, :5329-5333
[24]   Multi-Noise Representation Learning for Robust Speaker Recognition [J].
Cho, Sunyoung ;
Wee, Kyungchul .
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 :681-685
[25]   Feature Extraction Methods for Speaker Recognition: A Review [J].
Chaudhary, Gopal ;
Srivastava, Smriti ;
Bhardwaj, Saurabh .
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2017, 31 (12)
[26]   A Novel Feature Extraction Methods for Speaker Recognition [J].
Zou, Muchun .
COMMUNICATIONS AND INFORMATION PROCESSING, PT 1, 2012, 288 :713-722
[27]   AN EFFICIENT FEATURE SELECTION METHOD FOR SPEAKER RECOGNITION [J].
Sun, Hanwu ;
Ma, Bin ;
Li, Haizhou .
2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, :181-184
[28]   Power Wavelet Cepstral Coefficients (PWCC): An Accurate Auditory Model-Based Feature Extraction Method for Robust Speaker Recognition [J].
Zouhir, Youssef ;
Zarka, Mohamed ;
Ouni, Kais ;
Amraoui, Lilia El .
IEEE ACCESS, 2025, 13 :102323-102338
[29]   Influence of binary mask estimation errors on robust speaker identification [J].
May, Tobias .
SPEECH COMMUNICATION, 2017, 87 :40-48
[30]   A Two Stage Mask Estimation Approach to Robust Speaker Verification [J].
Zhao, Yali ;
Xie, Lei ;
Fu, Zhonghua .
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, :2653-2656