Learning Expressive Linkage Rules using Genetic Programming

Cited by: 51
Authors
Isele, Robert [1]
Bizer, Christian [1]
Affiliations
[1] Free Univ Berlin, Web Based Syst Grp, Garystr 21, D-14195 Berlin, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2012 / Volume 5 / Issue 11
DOI
10.14778/2350229.2350276
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A central problem in data integration and data cleansing is to find entities in different data sources that describe the same real-world object. Many existing methods for identifying such entities rely on explicit linkage rules, which specify the conditions that entities must fulfill in order to be considered to describe the same real-world object. In this paper, we present the GenLink algorithm for learning expressive linkage rules from a set of existing reference links using genetic programming. The algorithm is capable of generating linkage rules which select discriminative properties for comparison, apply chains of data transformations to normalize property values, choose appropriate distance measures and thresholds, and combine the results of multiple comparisons using non-linear aggregation functions. Our experiments show that the GenLink algorithm outperforms the state-of-the-art genetic programming approach to learning linkage rules recently presented by Carvalho et al. and is capable of learning linkage rules which achieve an accuracy similar to that of human-written rules for the same problem.
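The abstract names four building blocks of a linkage rule: property selection, transformation chains, distance measures with thresholds, and aggregation of several comparisons. The following is a minimal Python sketch of how such a rule could be represented and evaluated as a small tree. It is not the GenLink implementation: the class names `Comparison` and `Aggregation`, the helper `is_link`, the linear distance-to-score mapping, the `cutoff` value, and the sample entities are all assumptions made for illustration, and the genetic programming search that would learn such a tree from reference links is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Entity = Dict[str, str]  # hypothetical entity format: property name -> value


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


@dataclass
class Comparison:
    """Compares one property of two entities (illustrative, not GenLink's API)."""
    prop: str                               # property selected for comparison
    transforms: List[Callable[[str], str]]  # chain applied to both values
    distance: Callable[[str, str], float]   # distance measure
    threshold: float                        # must be > 0

    def score(self, a: Entity, b: Entity) -> float:
        va, vb = a.get(self.prop, ""), b.get(self.prop, "")
        for t in self.transforms:
            va, vb = t(va), t(vb)
        d = self.distance(va, vb)
        # Assumed mapping: 1.0 at distance 0, falling to 0.0 at the threshold.
        return max(0.0, 1.0 - d / self.threshold)


@dataclass
class Aggregation:
    """Combines child scores with an arbitrary function such as min or max."""
    combine: Callable[[List[float]], float]
    children: list

    def score(self, a: Entity, b: Entity) -> float:
        return self.combine([c.score(a, b) for c in self.children])


def is_link(rule, a: Entity, b: Entity, cutoff: float = 0.5) -> bool:
    """Decides whether two entities are linked; the cutoff is an assumption."""
    return rule.score(a, b) >= cutoff


# Hypothetical rule: names must be close after normalization AND years equal.
rule = Aggregation(min, [
    Comparison("name", [str.lower, str.strip], levenshtein, 3.0),
    Comparison("year", [], lambda x, y: 0.0 if x == y else 1.0, 1.0),
])

a = {"name": " Berlin ", "year": "2012"}
b = {"name": "berlin", "year": "2012"}
c = {"name": "Munich", "year": "2012"}

print(is_link(rule, a, b))
print(is_link(rule, a, c))
```

Using `min` as the aggregation makes the rule behave like a conjunction, since every comparison has to score well; swapping in `max` would behave like a disjunction. In a learning setting, each field of the tree (property, transformation chain, distance, threshold, aggregation) would be a candidate for mutation or crossover.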
Pages: 1638 - 1649
Page count: 12
Related papers
50 in total
  • [21] Learning iterative dispatching rules for job shop scheduling with genetic programming
    Su Nguyen
    Mengjie Zhang
    Mark Johnston
    Kay Chen Tan
    The International Journal of Advanced Manufacturing Technology, 2013, 67 : 85 - 100
  • [22] Evaluating Class Association Rules using Genetic Relation Programming
    Gonzales, Eloy
    Taboada, Karla
    Shimada, Kaoru
    Mabu, Shingo
    Hirasawa, Kotaro
    2008 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-8, 2008, : 731 - 736
  • [23] Discovering Fuzzy Classification Rules using Genetic Network Programming
    Taboada, Karla
    Gonzales, Eloy
    Shimada, Kaoru
    Mabu, Shingo
    Hirasawa, Kotaro
    2008 PROCEEDINGS OF SICE ANNUAL CONFERENCE, VOLS 1-7, 2008, : 1723 - 1728
  • [24] Mining multiple comprehensible classification rules using genetic programming
    Tan, KC
    Tay, A
    Lee, TH
    Heng, CM
    CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002, : 1302 - 1307
  • [25] Learning comprehensible classification rules from gene expression data using genetic programming and biological ontologies
    Goertzel, Ben
    Coelho, Locio De Souza
    Pennachin, Cassio
    Goertzel, Izabela Freire
    De Queiroz, Murilo Saraiva
    Prosdocimi, Francisco
    Lobo, Francisco Pereira
    APPLIED ARTIFICIAL INTELLIGENCE, 2006, : 573 - +
  • [26] Multi-view semi-supervised learning using Genetic Programming interpretable classification rules
    Garcia-Martinez, Carlos
    Ventura, Sebastian
    2017 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2017, : 573 - 579
  • [27] Multitask Visual Learning Using Genetic Programming
    Jaskowski, Wojciech
    Krawiec, Krzysztof
    Wieloch, Bartosz
    EVOLUTIONARY COMPUTATION, 2008, 16 (04) : 439 - 459
  • [28] Algorithm Tuners for PSO Methods and Genetic Programming Techniques for Learning Tuning Rules
    Kanemasa, Minoru
    Aiyoshi, Eitaro
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2014, 9 (04) : 407 - 414
  • [29] Machine learning of symbolic compositional rules with genetic programming: dissonance treatment in Palestrina
    Anders, Torsten
    Inden, Benjamin
    PEERJ COMPUTER SCIENCE, 2019, 2019 (12) : 1 - 19
  • [30] Unsupervised learning of word segmentation rules with genetic algorithms and inductive logic programming
    Kazakov, D
    Manandhar, S
    MACHINE LEARNING, 2001, 43 (1-2) : 121 - 162