f-Divergence is a Generalized Invariant Measure Between Distributions

Times cited: 0
Authors
Qiao, Yu [1 ]
Minematsu, Nobuaki [1 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Tokyo, Japan
Source
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008
Keywords
f-divergence; invariant measure; invertible transformation; speech recognition;
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Finding measures (or features) invariant to the inevitable variations caused by non-linguistic factors (transformations) is a fundamental and important problem in speech recognition. Recently, Minematsu [1, 2] proved that the Bhattacharyya distance (BD) between two distributions is invariant to invertible transforms of the feature space, and developed an invariant structural representation of speech based on it. This raises a question: which kinds of measures can be invariant? In this paper, we prove that f-divergence yields a generalized family of invariant measures, and show that any invariant measure must take the form of an f-divergence. Many well-known distances and divergences in information theory and statistics, such as the Bhattacharyya distance (BD), KL-divergence, and Hellinger distance, can be written as f-divergences. As an application, we carried out experiments on recognizing utterances of connected Japanese vowels. The experimental results show that BD and KL-divergence achieve the best performance among the measures compared.
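The invariance property summarized in the abstract can be checked numerically for the Gaussian case. The sketch below is not taken from the paper; it assumes two multivariate Gaussians and an invertible affine map y = A x + b (all illustrative choices) and verifies that the closed-form Bhattacharyya distance is unchanged by the transform.

```python
# Minimal sketch: Bhattacharyya distance between two Gaussians is
# unchanged by an invertible affine map y = A x + b. Distributions,
# dimension, and the transform are illustrative assumptions.
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharyya distance between two multivariate Gaussians."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    term2 = 0.5 * np.log(np.linalg.det(cov)
                         / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

rng = np.random.default_rng(0)
d = 3
mu1, mu2 = rng.normal(size=d), rng.normal(size=d)
L1, L2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
cov1, cov2 = L1 @ L1.T + np.eye(d), L2 @ L2.T + np.eye(d)  # positive definite

A = rng.normal(size=(d, d)) + 2.0 * np.eye(d)  # almost surely invertible
b = rng.normal(size=d)

bd_before = bhattacharyya_gaussian(mu1, cov1, mu2, cov2)
bd_after = bhattacharyya_gaussian(A @ mu1 + b, A @ cov1 @ A.T,
                                  A @ mu2 + b, A @ cov2 @ A.T)
print(bd_before, bd_after)  # the two values agree up to floating-point error
```

The same check can be repeated with other f-divergences that have closed forms for Gaussians, such as the KL-divergence.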
Pages: 1349-1352
Page count: 4
References (17 in total)
[1] Ali, S. M. (1966). J. Roy. Stat. Soc. B, 28, 131.
[2] [Anonymous]. (2005). Proc. INTERSPEECH.
[3] Asakawa, S. (2007). Proc. 8th Annual Conference of the International Speech Communication Association, 890.
[4] Asakawa, S. (2008). Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 4097.
[5] Csiszár, I. (1967). Studia Scientiarum Mathematicarum Hungarica, 2, 299.
[6] Csiszár, I. (2004). Information Theory and Statistics: A Tutorial.
[7] Gauvain, J.-L., & Lee, C.-H. (1994). Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing, 2(2), 291-298.
[8] Hershey, J. R., & Olsen, P. A. (2007). Approximating the Kullback Leibler divergence between Gaussian mixture models. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. IV, 317-320.
[9] Kawahara, T. (2004). Proc. ICSLP, 3069.
[10] Leggetter, C. J., & Woodland, P. C. (1995). Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, 9(2), 171-185.