Phoneme recognition using an adaptive supervised manifold learning algorithm

Cited by: 1
Authors
Zhao, Xiaoming [2]
Zhang, Shiqing [1]
Affiliations
[1] Taizhou Univ, Sch Phys & Elect Engn, Taizhou 318000, Peoples R China
[2] Taizhou Univ, Dept Comp Sci, Taizhou 318000, Peoples R China
Keywords
Dimensionality reduction; Manifold learning; Locally linear embedding; Phoneme recognition; Feature extraction; Speech recognition; Representations
DOI
10.1007/s00521-012-1032-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To handle speech data that lie on a nonlinear manifold embedded in a high-dimensional acoustic space, this paper proposes an adaptive supervised manifold learning algorithm, based on locally linear embedding (LLE), for nonlinear dimensionality reduction. The algorithm extracts low-dimensional embedded data representations for phoneme recognition. It aims to maximize interclass dissimilarity while minimizing intraclass dissimilarity, in order to improve the discriminating power and generalization ability of the low-dimensional embedded representations. The proposed method is compared with five well-known dimensionality reduction methods: principal component analysis, linear discriminant analysis, isometric mapping (Isomap), LLE, and the original supervised LLE. Experimental results on three benchmark speech databases (the Deterding database, the DARPA TIMIT database, and the ISOLET E-set database) show that the proposed method achieves promising performance on the phoneme recognition task and outperforms the other methods tested.
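The abstract does not spell out the adaptive algorithm itself, so the following is only a minimal sketch of the neighbor-selection idea behind the original supervised LLE, one of the baselines named above. It assumes the commonly described formulation in which the distance between two samples from different classes is increased by `alpha * max(distance)` before the nearest neighbors are chosen; the function names, the toy data, and the choice `alpha = 1.0` are illustrative assumptions, not taken from the paper.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def supervised_distances(dist, labels, alpha):
    """Add alpha * max(dist) to every pair whose class labels differ.

    alpha = 0 leaves the distances unchanged (plain LLE neighbor search);
    alpha = 1 pushes different-class samples at least max(dist) apart.
    This is an assumed form of the original supervised LLE, not the
    adaptive variant proposed in the paper.
    """
    max_d = max(max(row) for row in dist)
    n = len(dist)
    return [
        [dist[i][j] + (alpha * max_d if labels[i] != labels[j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def k_nearest(dist, k):
    """Indices of the k nearest neighbors of each sample, excluding itself."""
    n = len(dist)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]


# Toy one-dimensional data with two interleaved classes (illustrative only).
points = [(0.0,), (1.0,), (1.5,), (3.0,)]
labels = [0, 1, 0, 1]

d = pairwise_distances(points)
plain_neighbors = k_nearest(d, k=1)
supervised_neighbors = k_nearest(supervised_distances(d, labels, alpha=1.0), k=1)
```

On this toy data the unsupervised search picks each sample's closest point regardless of class, while the class-aware distances steer every sample toward a neighbor from its own class. The remaining LLE steps (reconstruction weights and the eigenvector embedding) would then run on the selected neighborhoods and are omitted here.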
Pages: 1501-1515
Number of pages: 15