Hard-Mask Missing Feature Theory for Robust Speaker Recognition

Cited by: 4

Authors:

Lim, Shin-Cheol [1]

Jang, Sei-Jin [2]

Lee, Soek-Pil [2]

Kim, Moo Young [1]

Affiliations:

[1] Sejong Univ, Human Comp Interact Lab, Dept Informat & Commun Engn, Seoul, South Korea

[2] Korea Elect Technol Inst, Digital Media Res Ctr, Seoul, South Korea

Source:

IEEE TRANSACTIONS ON CONSUMER ELECTRONICS | 2011 / Vol. 57 / Issue 03

Keywords:

Speaker recognition; missing feature theory; MFT; AMFT; NOISE;

DOI:

10.1109/TCE.2011.6018880

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Compared with conventional full-band speaker recognition systems, Advanced Missing Feature Theory (AMFT) produces a much lower error rate, but requires increased computational complexity. We propose a weighting function for the score calculation algorithm in AMFT. The weighting function is estimated by calculating the number of reliable spectral components. A modified mask is also proposed to reduce the number of reliable components based on the estimated weighting function. In the proposed Hard-mask MFT-8 (HMFT-8), only 8 elements are selected out of 10 spectral components in a feature vector. Compared with the full-band system and the AMFT, the proposed HMFT-8 gives a lower identification error rate by 16.95% and 2.67%, respectively. In terms of computational complexity, AMFT and HMFT-8 require 307 and 41 arithmetic and conditional operations for each frame, respectively.
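The abstract names three ingredients: a hard mask that keeps 8 of 10 spectral components, a score computed over the reliable components, and a weight derived from the number of reliable components. The abstract does not give the formulas, so the following is only a minimal sketch under stated assumptions, not the authors' method: the per-component reliability values, the top-k selection rule, the diagonal-Gaussian score, and the weight `n_reliable / n_total` are all hypothetical choices made for illustration, as are the function names.

```python
import numpy as np

def hard_mask_top_k(reliability, k=8):
    """Boolean mask that keeps the k components with the highest reliability.

    Assumption: 'reliability' is some per-component score (for example a
    local SNR estimate); the abstract does not say how it is obtained.
    """
    reliability = np.asarray(reliability, dtype=float)
    mask = np.zeros(reliability.shape, dtype=bool)
    mask[np.argsort(reliability)[-k:]] = True
    return mask

def masked_log_likelihood(x, mean, var, mask):
    """Diagonal-Gaussian log-likelihood summed over reliable components only.

    Unreliable components are simply left out of the sum, which is the usual
    marginalization idea in missing feature approaches (assumed here).
    """
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    per_component = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)
    return float(per_component[mask].sum())

def frame_weight(mask):
    """Hypothetical weight: fraction of components marked reliable."""
    return float(mask.sum()) / mask.size

# Usage with a 10-component feature vector, as in the HMFT-8 setting.
reliability = np.arange(10)          # toy values: components 2..9 rank highest
mask = hard_mask_top_k(reliability, k=8)
x = np.zeros(10)
score = frame_weight(mask) * masked_log_likelihood(x, np.zeros(10), np.ones(10), mask)
```

With a fixed `k`, the number of scored components is the same for every frame, which is one plausible reading of why a hard mask lowers the per-frame operation count reported in the abstract.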


Pages: 1245-1250

Page count: 6

50 records in total

[21] Spectral-temporal receptive fields and MFCC balanced feature extraction for robust speaker recognition [J].

Wang, Jia-Ching ;

Wang, Chien-Yao ;

Chin, Yu-Hao ;

Liu, Yu-Ting ;

Chen, En-Ting ;

Chang, Pao-Chi .

MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (03) :4055-4068

[22] Noise-robust speaker recognition using subband likelihoods and reliable-feature selection [J].

Kim, Sungtak ;

Ji, Mikyong ;

Kim, Hoirin .

ETRI JOURNAL, 2008, 30 (01) :89-100

[23] X-VECTORS: ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION [J].

Snyder, David ;

Garcia-Romero, Daniel ;

Sell, Gregory ;

Povey, Daniel ;

Khudanpur, Sanjeev .

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, :5329-5333

[24] Multi-Noise Representation Learning for Robust Speaker Recognition [J].

Cho, Sunyoung ;

Wee, Kyungchul .

IEEE SIGNAL PROCESSING LETTERS, 2025, 32 :681-685

[25] Feature Extraction Methods for Speaker Recognition: A Review [J].

Chaudhary, Gopal ;

Srivastava, Smriti ;

Bhardwaj, Saurabh .

INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2017, 31 (12)

[26] A Novel Feature Extraction Methods for Speaker Recognition [J].

Zou, Muchun .

COMMUNICATIONS AND INFORMATION PROCESSING, PT 1, 2012, 288 :713-722

[27] AN EFFICIENT FEATURE SELECTION METHOD FOR SPEAKER RECOGNITION [J].

Sun, Hanwu ;

Ma, Bin ;

Li, Haizhou .

2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, :181-184

[28] Power Wavelet Cepstral Coefficients (PWCC): An Accurate Auditory Model-Based Feature Extraction Method for Robust Speaker Recognition [J].

Zouhir, Youssef ;

Zarka, Mohamed ;

Ouni, Kais ;

Amraoui, Lilia El .

IEEE ACCESS, 2025, 13 :102323-102338

[29] Influence of binary mask estimation errors on robust speaker identification [J].

May, Tobias .

SPEECH COMMUNICATION, 2017, 87 :40-48

[30] A Two Stage Mask Estimation Approach to Robust Speaker Verification [J].

Zhao, Yali ;

Xie, Lei ;

Fu, Zhonghua .

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, :2653-2656
