Automatic Speech Recognition Based on Non-Uniform Error Criteria

Cited by: 5
Authors
Fu, Qiang [1]
Zhao, Yong [2]
Juang, Biing-Hwang [2]
Affiliations
[1] Fetch Technol, El Segundo, CA 90245 USA
[2] Georgia Inst Technol, Dept Elect & Comp Engn, Atlanta, GA 30332 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2012, Vol. 20, No. 3
Keywords
Non-uniform error cost; minimum classification error cost (MCEC); minimum classification error
DOI
10.1109/TASL.2011.2165279
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Bayes decision theory is the foundation of the classical statistical pattern recognition approach, with the expected error as the performance objective. For most pattern recognition problems, the "error" is conventionally assumed to be binary, i.e., 0 or 1, equivalent to error counting, independent of the specifics of the error made by the system. The term "error rate" has thus long been considered the prevalent system performance measure. This performance measure, nonetheless, may not be satisfactory in many practical applications. In automatic speech recognition, for example, it is well known that some errors are more detrimental (e.g., more likely to lead to misunderstanding of the spoken sentence) than others. In this paper, we propose an extended framework for the speech recognition problem with a non-uniform classification/recognition error cost, which can be controlled by the system designer. In particular, we address the issue of system model optimization when the cost of a recognition error is class dependent. We formulate the problem in the framework of the minimum classification error (MCE) method, after appropriate generalization to integrate the class-dependent error cost into one consistent objective function for optimization. We present a variety of training scenarios for automatic speech recognition under this extended framework. Experimental results for continuous speech recognition are provided to demonstrate the effectiveness of the new approach.
Pages: 780-793
Page count: 14
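
The abstract describes folding a class-dependent error cost into an MCE-style objective but gives no equations. The sketch below is one plausible reading, not the paper's formulation: it assumes the commonly used MCE misclassification measure (true-class score against a soft maximum of competitor scores) and a sigmoid-smoothed error count, then scales that smoothed error by a per-class cost. The names `class_cost`, `eta`, and `gamma` are illustrative assumptions.

```python
import math


def misclassification_measure(scores, label, eta=1.0):
    """Commonly used MCE measure: positive when competitors outscore the true class."""
    competitors = [s for j, s in enumerate(scores) if j != label]
    # Soft maximum of competitor scores: (1/eta) * log(mean(exp(eta * s))),
    # computed with the max subtracted first for numerical stability.
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    soft_max = m + math.log(mean_exp) / eta
    return soft_max - scores[label]


def weighted_mce_loss(scores, label, class_cost, eta=1.0, gamma=1.0):
    """Sigmoid-smoothed 0/1 error scaled by a class-dependent cost (assumed form)."""
    d = misclassification_measure(scores, label, eta)
    smoothed_error = 1.0 / (1.0 + math.exp(-gamma * d))
    return class_cost[label] * smoothed_error


# Hypothetical example: three classes, errors on class 1 cost three times as much.
scores = [2.0, 0.0, 0.0]
class_cost = [1.0, 3.0, 1.0]
loss_if_true_is_0 = weighted_mce_loss(scores, 0, class_cost)  # token scored correctly
loss_if_true_is_1 = weighted_mce_loss(scores, 1, class_cost)  # token scored wrongly
```

With a uniform `class_cost` of all ones, this reduces to the ordinary smoothed error count, which matches the abstract's remark that conventional error counting treats all errors alike.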