Duplicate Detection Exploiting Data Relationships

Cited by: 0
Authors
Herschel, Melanie [1]
Institutions
[1] Univ Tubingen, Wilhelm Schickard Inst Informat, Lehrstuhl Datenbanksyst, Sand 13, D-72076 Tubingen, Germany
Source
IT-INFORMATION TECHNOLOGY | 2009 / Volume 51 / Issue 04
Keywords
H.2 [Information Systems]: Database Management; H.2.5 [Information Systems]: Database Management: Heterogeneous Databases; duplicate detection; algorithms; performance; data quality; data integration
DOI
10.1524/itit.2009.0546
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Duplicate detection consists of identifying multiple, different database representations of the same real-world object. State-of-the-art duplicate detection systems usually concentrate on identifying duplicates in a single relational table, thereby ignoring that the data may exist in a larger context that, when considered, can significantly improve the performance of duplicate detection. In this paper, we present algorithms that exploit relationships that exist in the data.
Pages: 231-234
Number of pages: 4