Matching Attributes Across Overlapping Heterogeneous Data Sources Using Mutual Information

Cited by: 3
Author
Zhao, Huimin [1]
Affiliation
[1] Univ Wisconsin Milwaukee, Sheldon B Lubar Sch Business, Milwaukee, WI 53201 USA
Keywords
Attribute Correspondence; Attribute Matching; Heterogeneous Databases; Information Theory; Mutual Information; SEMANTIC-INTEGRATION; SCHEMA; CORRESPONDENCES; RETRIEVAL; DATABASES
DOI
10.4018/jdm.2010100105
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Identifying matching attributes across heterogeneous data sources is a critical and time-consuming step in integrating the data sources. In this paper, the author proposes a method for matching the most frequently encountered types of attributes across overlapping heterogeneous data sources. The author uses mutual information as a unified measure of dependence on various types of attributes. An example is used to demonstrate the utility of the proposed method, which is useful in developing practical attribute matching tools.
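As an illustration only, the idea of scoring a candidate attribute pair by mutual information can be sketched for the simplest case: two discrete attributes whose values are paired row by row for entities present in both sources. This is not the author's method, which the abstract says covers several attribute types; the function name `mutual_information` and the sample columns are hypothetical, and the sketch assumes the overlapping records have already been linked.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two paired discrete columns.

    Hypothetical helper for illustration; assumes xs[i] and ys[i] describe
    the same real-world entity in the two data sources.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must be paired row by row")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) value pair
    px = Counter(xs)              # marginal counts for the first attribute
    py = Counter(ys)              # marginal counts for the second attribute
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical columns from two sources describing the same four entities.
gender_a = ["M", "F", "M", "F"]
sex_code_b = [1, 0, 1, 0]        # a recoding of gender_a
unrelated_b = [1, 1, 0, 0]       # statistically independent of gender_a

print(mutual_information(gender_a, sex_code_b))
print(mutual_information(gender_a, unrelated_b))
```

In this toy data the recoded column shares all of its information with `gender_a`, while the unrelated column shares none, so a higher score suggests a more plausible correspondence. Numeric or mixed-type attributes would need discretization or a different estimator, which this sketch does not attempt.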
Pages: 91-110
Page count: 20