Learning stochastic tree edit distance

被引:0
作者
Bernard, Marc
Habrard, Amaury
Sebban, Marc
机构
[1] Univ St Etienne, EURISE, F-42023 St Etienne 2, France
[2] Univ Aix Marseille 1, LIF, F-13453 Marseille 13, France
来源
MACHINE LEARNING: ECML 2006, PROCEEDINGS | 2006年 / 4212卷
关键词
stochastic tree edit distance; EM algorithm; generative models; discriminative models;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Trees provide a suited structural representation to deal with complex tasks such as web information extraction, RNA secondary structure prediction, or conversion of tree structured documents. In this context, many applications require the calculation of similarities between tree pairs. The most studied distance is likely the tree edit distance (ED) for which improvements in terms of complexity have been achieved during the last decade. However, this classic ED usually uses a priori fixed edit costs which are often difficult to tune, that leaves little room for tackling complex problems. In this paper, we focus on the learning of a stochastic tree ED. We use an adaptation of the Expectation-Maximization algorithm for learning the primitive edit costs. We carried out series of experiments that confirm the interest to learn a tree ED rather than a priori imposing edit costs.
引用
收藏
页码:42 / 53
页数:12
相关论文
共 11 条
[1]   A survey on tree edit distance and related problems [J].
Bille, P .
THEORETICAL COMPUTER SCIENCE, 2005, 337 (1-3) :217-239
[2]  
BOUCHARD G, 2004, COMPSTAT 2004
[3]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[4]  
Durbin R., 1998, BIOL SEQUENCE ANAL
[5]  
KLEIN P, 1998, P 6 EUR S ALG, P91
[6]  
MCCALLUM A, 2005, UAI2005
[7]   A Probabilistic approach to learning costs for graph edit distance [J].
Neuhaus, M ;
Bunke, H .
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, 2004, :389-393
[8]  
ONCINA J, 2006, IN PRESS J PATTERN R
[9]   Learning string-edit distance [J].
Ristad, ES ;
Yianilos, PN .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (05) :522-532
[10]   TREE-TO-TREE EDITING PROBLEM [J].
SELKOW, SM .
INFORMATION PROCESSING LETTERS, 1977, 6 (06) :184-186