Bhattacharyya and expected likelihood kernels

Times Cited: 64
Authors
Jebara, T [1 ]
Kondor, R [1 ]
Affiliation
[1] Columbia Univ, New York, NY 10027 USA
Source
LEARNING THEORY AND KERNEL MACHINES | 2003 / Vol. 2777
DOI
10.1007/978-3-540-45167-9_6
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We introduce a new class of kernels between distributions. These induce a kernel on the input space between data points by associating to each datum a generative model fit to that data point individually. The kernel is then computed by integrating the product of the two generative models corresponding to two data points. This kernel permits discriminative estimation via, for instance, support vector machines, while exploiting the properties, assumptions, and invariances inherent in the choice of generative model. It satisfies Mercer's condition and can be computed in closed form for a large class of models, including exponential family models, mixtures, hidden Markov models and Bayesian networks. For other models the kernel can be approximated by sampling methods. Experiments are shown for multinomial models in text classification and for hidden Markov models in protein sequence classification.
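
As a concrete illustration of the multinomial case mentioned in the abstract, here is a minimal sketch in Python; the function name bhattacharyya_kernel and the smoothing parameter alpha are illustrative assumptions, not from the paper. It fits a maximum-likelihood multinomial to each document's word-count vector and evaluates the Bhattacharyya affinity k(p, q) = sum_i sqrt(p_i * q_i), which is the closed form of the integral of sqrt(p(x) q(x)) for multinomials over a shared vocabulary; this is the rho = 1/2 member of the kernel family, while rho = 1 gives the expected likelihood kernel.

import numpy as np

def bhattacharyya_kernel(counts_a, counts_b, alpha=0.0):
    # Fit a multinomial to each document by (optionally smoothed)
    # maximum likelihood, then return the Bhattacharyya affinity
    # k(p, q) = sum_i sqrt(p_i * q_i).
    p = (counts_a + alpha) / (counts_a + alpha).sum()
    q = (counts_b + alpha) / (counts_b + alpha).sum()
    return np.sum(np.sqrt(p * q))

# Toy usage: three "documents" over a five-word vocabulary.
docs = np.array([
    [3, 0, 1, 0, 0],
    [2, 1, 1, 0, 0],
    [0, 0, 0, 4, 2],
], dtype=float)

K = np.array([[bhattacharyya_kernel(a, b) for b in docs] for a in docs])
print(K)  # Gram matrix; k(p, p) = 1 for identical distributions

Because k(p, q) here is an explicit inner product between the feature vectors sqrt(p) and sqrt(q), the resulting Gram matrix is positive semidefinite, which is the Mercer property claimed in the abstract.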
Pages: 57-71
Number of pages: 15