Gender and Affect Recognition Based on GMM and GMM-UBM modeling with relevance MAP estimation

Cited: 0
Authors
Gajsek, Rok [1 ]
Zibert, Janez [2 ]
Justin, Tadej [1 ]
Struc, Vitomir [1 ]
Vesnicer, Bostjan [3 ]
Mihelic, France [1 ]
Affiliations
[1] Univ Ljubljana, Fac Elect Engn, Ljubljana 61000, Slovenia
[2] Univ Primorska, Dept Informat Sci & Technol, Primorska, Slovenia
[3] Alpineon Res & Dev, Ljubljana, Slovenia
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
emotion recognition; affect recognition; gender recognition; GMM-UBM; MAP;
DOI
Not available
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809
Abstract
The paper presents our efforts in the Gender Sub-Challenge and the Affect Sub-Challenge of the INTERSPEECH 2010 Paralinguistic Challenge. The system for the Gender Sub-Challenge models the Mel-Frequency Cepstral Coefficients with Gaussian mixture models, building a separate model for each gender category. For the Affect Sub-Challenge we propose a modeling scheme in which a universal background model is first trained on all the training data and then, using the maximum a posteriori estimation criterion, a new feature vector of means is produced for each individual sample. The feature set comprises the low-level descriptors from the baseline system, which in our case are split into four sub-sets, each modeled by its own model. Predictions from all sub-systems are combined using sum-rule fusion. In addition to the baseline regression procedure, we also evaluated Support Vector Regression and compared their performance. Both systems achieve higher recognition results on the development set than the baseline, but in the Affect Sub-Challenge our system's cross-correlation is lower than the baseline's, although its mean linear error is slightly better. In the Gender Sub-Challenge the unweighted average recall on the test set is 82.84%, and for the Affect Sub-Challenge the cross-correlation on the test set is 0.39 with a mean linear error of 0.143.
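The UBM-plus-relevance-MAP scheme the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the component count, feature dimensionality, and relevance factor are assumptions chosen for the toy example, and only the mean parameters are adapted (as in mean-supervector approaches).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy MFCC-like frames; in the paper the UBM is trained on all training data.
train_frames = rng.normal(size=(2000, 13))

# 1) Train the universal background model (UBM) on the pooled training data.
ubm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
ubm.fit(train_frames)

def map_adapted_means(frames, ubm, relevance=16.0):
    """Relevance-MAP adaptation of the UBM means to one utterance.

    Returns the stacked adapted means (a fixed-length 'feature vector
    of means') for the utterance; relevance=16 is a common but here
    assumed choice.
    """
    # Posterior probability of each mixture component for each frame.
    post = ubm.predict_proba(frames)               # shape (T, C)
    n_c = post.sum(axis=0)                         # soft counts per component
    # Posterior-weighted mean of the frames (first-order statistics).
    f_c = post.T @ frames                          # shape (C, D)
    e_c = f_c / np.maximum(n_c[:, None], 1e-10)
    # Data-dependent adaptation coefficient: alpha_c = n_c / (n_c + r).
    alpha = (n_c / (n_c + relevance))[:, None]
    # Interpolate between utterance statistics and the UBM means.
    adapted = alpha * e_c + (1.0 - alpha) * ubm.means_
    return adapted.ravel()                         # length C * D

# 2) MAP-adapt the UBM means to one utterance's frames.
utt_frames = rng.normal(loc=0.3, size=(150, 13))
feat = map_adapted_means(utt_frames, ubm)
print(feat.shape)   # (104,) = 8 components * 13 dimensions
```

In the paper this per-sample mean vector is computed separately for each of the four descriptor sub-sets, and the sub-systems' predictions are then combined by sum-rule fusion.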
Pages: 2814 / +
Page count: 2