Reconciling schemas of disparate data sources: A machine-learning approach

Cited by: 0
Authors
Doan, AH [1]
Domingos, P [1]
Halevy, A [1]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
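Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract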
A data-integration system provides access to a multitude of data sources through a single mediated schema. A key bottleneck in building such systems has been the laborious manual construction of semantic mappings between the source schemas and the mediated schema. We describe LSD, a system that employs and extends current machine-learning techniques to semi-automatically find such mappings. LSD first asks the user to provide the semantic mappings for a small set of data sources, then uses these mappings together with the sources to train a set of learners. Each learner exploits a different type of information either in the source schemas or in their data. Once the learners have been trained, LSD finds semantic mappings for a new data source by applying the learners, then combining their predictions using a meta-learner. To further improve matching accuracy, we extend machine-learning techniques so that LSD can incorporate domain constraints as an additional source of knowledge, and develop a novel learner that utilizes the structural information in XML documents. Our approach is thus distinguished in that it incorporates multiple types of knowledge. Importantly, its architecture is extensible to additional learners that may exploit new kinds of information. We describe a set of experiments on several real-world domains, and show that LSD proposes semantic mappings with a high degree of accuracy.
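The train-then-combine workflow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class names `TokenOverlapLearner` and `WeightedMetaLearner`, the token-overlap scoring, the accuracy-based weighting, and the toy real-estate elements and labels are all hypothetical. The abstract does not say which learners LSD uses or how its meta-learner combines them, and the sketch leaves out domain constraints and the XML structure learner.

```python
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


class TokenOverlapLearner:
    """Hypothetical base learner: scores a mediated-schema label by how many
    tokens of one field (element name or data value) it saw with that label."""

    def __init__(self, field):
        self.field = field  # "name" (schema information) or "value" (data)
        self.label_tokens = defaultdict(Counter)

    def train(self, examples):
        for element, label in examples:
            self.label_tokens[label].update(tokens(element[self.field]))

    def predict_scores(self, element):
        """Return label -> score, normalized so the scores sum to 1."""
        toks = tokens(element[self.field])
        raw = {}
        for label, seen in self.label_tokens.items():
            matches = sum(1 for t in toks if t in seen)
            raw[label] = matches + 0.1  # small constant keeps scores positive
        norm = sum(raw.values())
        return {label: score / norm for label, score in raw.items()}


class WeightedMetaLearner:
    """Hypothetical meta-learner: weighted sum of the base learners' scores."""

    def __init__(self, learners):
        self.learners = learners
        self.weights = [1.0] * len(learners)

    def train(self, examples):
        for learner in self.learners:
            learner.train(examples)
        # Assumption: weight each learner by its accuracy on the training
        # examples. This is optimistic; held-out data would be more honest.
        self.weights = []
        for learner in self.learners:
            correct = 0
            for element, label in examples:
                scores = learner.predict_scores(element)
                if max(scores, key=scores.get) == label:
                    correct += 1
            self.weights.append(correct / len(examples))

    def predict_scores(self, element):
        combined = defaultdict(float)
        for weight, learner in zip(self.weights, self.learners):
            for label, score in learner.predict_scores(element).items():
                combined[label] += weight * score
        return dict(combined)

    def predict(self, element):
        scores = self.predict_scores(element)
        return max(scores, key=scores.get)


# User-provided mappings for a small training source (toy, made-up data).
training = [
    ({"name": "listed-price", "value": "$250,000"}, "PRICE"),
    ({"name": "contact-phone", "value": "(206) 555 0101"}, "AGENT-PHONE"),
    ({"name": "comments", "value": "Fantastic house, great location"}, "DESCRIPTION"),
]

meta = WeightedMetaLearner([TokenOverlapLearner("name"), TokenOverlapLearner("value")])
meta.train(training)

# An element from a new, unmapped source.
new_element = {"name": "agent-phone", "value": "(206) 555 0199"}
predicted = meta.predict(new_element)
print(predicted)
```

In this sketch each base learner sees only one kind of evidence, and adding a further learner amounts to appending another object with `train` and `predict_scores` methods to the list, which loosely mirrors the extensibility the abstract claims.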
Pages: 509-520
Page count: 12