Syntactical Similarity Learning by Means of Grammatical Evolution

Cited by: 11
Authors
Bartoli, Alberto [1]
De Lorenzo, Andrea [1]
Medvet, Eric [1]
Tarlao, Fabiano [1]
Affiliations
[1] University of Trieste, Department of Engineering and Architecture, Trieste, Italy
Source
Parallel Problem Solving from Nature - PPSN XIV | 2016 / Vol. 9921
Keywords
Distance learning; Entity extraction; String patterns
DOI
10.1007/978-3-319-45823-6_24
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Several research efforts have shown that a similarity function synthesized from examples may capture an application-specific similarity criterion in a way that fits the application's needs more effectively than a generic distance definition. In this work, we propose a similarity learning algorithm tailored to problems of syntax-based entity extraction from unstructured text streams. The algorithm takes as input pairs of strings, each with an indication of whether or not the two strings adhere to the same syntactic pattern. Our approach is based on Grammatical Evolution and systematically explores a space of similarity definitions that includes all functions expressible in a simple, specialized language that we defined for this purpose. We assessed our proposal on patterns representative of practical applications. The results suggest that the proposed approach is indeed feasible and that the learned similarity function is more effective than the Levenshtein distance and the Jaccard similarity index.
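For orientation, the two generic baselines named in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' code and not the learned similarity function: the use of character bigrams for the Jaccard index, the function names, and the sample labeled pairs are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str, n: int = 2) -> float:
    """Jaccard index over sets of character n-grams (n=2 is an assumption here)."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    if not grams_a and not grams_b:
        return 1.0  # convention chosen for this sketch: two empty sets are identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical labeled pairs in the input form the abstract describes:
# (string, string, whether both adhere to the same syntactic pattern).
pairs = [
    ("2016-09-17", "1999-01-05", True),
    ("2016-09-17", "PPSN XIV", False),
]

for s1, s2, same_pattern in pairs:
    print(s1, s2, same_pattern, levenshtein(s1, s2), round(jaccard(s1, s2), 3))
```

Both measures compare characters rather than syntactic structure, so two strings that follow the same pattern (such as two dates) may still score as dissimilar. That gap is what a similarity function learned from labeled pairs is meant to address.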
Pages: 260-269
Page count: 10