Duplicate Data Detection Using GNN

Cited by: 0
Authors
Lu, Hanrong [1]
Chen, Xin [1]
Lan, Xuhui [1]
Zheng, Feng [1]
Affiliations
[1] Air Force Early Warning Acad, Dept Early Warning Intelligence, Wuhan, Peoples R China
Source
PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2016) | 2016
Keywords
record detection; data cleaning; neural network; genetic algorithm; GNN; NETWORK; PERFORMANCE;
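DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract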
In applications such as data warehousing, data mining, or information integration, data must be cleaned as a preprocessing step to ensure data quality and application performance. An essential task in data cleaning is duplicate record detection. Existing detection methods apply to different data models and record types, but certain difficulties with those studies remain to be overcome. This paper proposes a genetic neural network (GNN) based approach to duplicate record detection. The topology and weight vector of a neural network are first optimized by a genetic algorithm for the given data set, and the network is then used to perform the detection. The method can enhance detection accuracy and alleviate many of the problems with previous work.
Pages: 167-170
Page count: 4