Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal

Cited by: 45
Authors
Dobry, Gil [1]
Hecht, Ron M. [2]
Avigal, Mireille [1]
Zigel, Yaniv [3]
Affiliations
[1] Open Univ Israel, Dept Comp Sci, IL-43721 Raanana, Israel
[2] Gen Motors Adv Tech Ctr, Human Machine Interface Grp, IL-46725 Herzliyya, Israel
[3] Ben Gurion Univ Negev, Dept Biomed Engn, IL-84105 Beer Sheva, Israel
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011, Vol. 19, No. 7
Keywords
Age estimation; dimension reduction; Gaussian mixture model (GMM) supervector; support vector machine (SVM); support vector regression;
DOI
10.1109/TASL.2011.2104955
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
This paper presents a novel dimension reduction method that aims to improve the accuracy and efficiency of speaker age estimation systems based on the speech signal. Two age estimation approaches were studied and implemented: the first is age-group classification, and the second is precise age estimation using regression. Both approaches use Gaussian mixture model (GMM) supervectors as features for a support vector machine (SVM) model. Using a radial basis function (RBF) kernel improves accuracy compared to a linear kernel; however, the computational complexity is then more sensitive to the feature dimension. Classic dimension reduction methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) tend to eliminate relevant feature information and cannot always be applied without damaging the model's accuracy. In our study, a novel dimension reduction method was developed: weighted-pairwise principal components analysis (WPPCA), based on the nuisance attribute projection (NAP) technique. This method projects the supervectors to a reduced space in which the redundant within-class pairwise variability is eliminated. The method was applied and compared to a baseline system in which no dimensionality reduction is performed on the supervectors. The experiments showed a dramatic speed-up in SVM training and testing time when using the reduced feature vectors. With the proposed dimension reduction method, system accuracy improved by 5% for the classification system and by 10% for the regression system.
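The projection idea in the abstract can be illustrated with a minimal sketch. This is not the WPPCA algorithm itself, whose pairwise weighting is not given in this record. It is a plain NAP-style step (removing the dominant within-class directions) followed by ordinary PCA, assuming supervectors are rows of a NumPy array and age groups are integer labels. The function name `nap_then_pca` and its parameters are hypothetical.

```python
import numpy as np

def nap_then_pca(X, labels, n_nuisance, n_components):
    """Illustrative sketch only: NAP-style removal of within-class
    directions followed by PCA. Not the paper's weighted-pairwise method."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    Xc = X - mean  # center the supervectors

    # Within-class deviations: each vector minus its own class mean.
    # (Assumption: class-mean deviations stand in for weighted pairs.)
    W = np.vstack([
        X[labels == c] - X[labels == c].mean(axis=0)
        for c in np.unique(labels)
    ])

    # Leading right singular vectors of W span the dominant
    # within-class (nuisance) variability.
    _, _, Vt_w = np.linalg.svd(W, full_matrices=False)
    U = Vt_w[:n_nuisance].T  # shape (d, n_nuisance), orthonormal columns

    # NAP projection P = I - U U^T, applied without forming the d x d matrix.
    Xp = Xc - (Xc @ U) @ U.T

    # Ordinary PCA on the projected data: keep the leading directions.
    _, _, Vt = np.linalg.svd(Xp, full_matrices=False)
    B = Vt[:n_components].T  # shape (d, n_components)

    return Xp @ B, (mean, U, B)
```

The reduced vectors returned by this sketch could then be passed to an RBF-kernel SVM or support vector regressor, where the lower dimension is what reduces kernel evaluation cost.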
Pages: 1975-1985
Page count: 11