Heterogeneous Committee-Based Active Learning for Entity Resolution (HeALER)

Cited by: 8
Authors
Chen, Xiao [1]
Xu, Yinlong [1]
Broneske, David [1]
Durand, Gabriel Campero [1]
Zoun, Roman [1]
Saake, Gunter [1]
Affiliations
[1] Otto von Guericke Univ, Magdeburg, Germany
Source
ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2019 | 2019 / Vol. 11695
Keywords
Entity resolution; Query-by-committee-based active learning; Learning-based entity resolution; Record linkage; RULES;
DOI
10.1007/978-3-030-28730-6_5
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Entity resolution identifies records that refer to the same real-world entity. Its classification step can use supervised learning, but this is limited by the availability of labeled training data. In this situation, active learning has been proposed to gather labels while reducing the human labeling effort, by selecting the most informative data as candidates for labeling. Committee-based active learning is one of the most commonly used approaches: it chooses the data on which the committee's votes disagree the most, treating these as the most informative. However, the current state-of-the-art committee-based active learning approaches for entity resolution have two main drawbacks. First, the selected initial training data is usually not balanced and informative enough. Second, the committee is formed from homogeneous classifiers whose accuracy is compromised to achieve diversity within the committee, i.e., the classifiers are not trained with all available training data or with the best parameter setting. In this paper, we propose our committee-based active learning approach HeALER, which overcomes both drawbacks by using more effective initial training data selection approaches and a more effective heterogeneous committee. We implemented HeALER and compared it with passive learning and other state-of-the-art approaches. The experimental results show that our approach outperforms other state-of-the-art committee-based active learning approaches.
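The selection step described in the abstract, picking the candidates on which committee votes disagree most, can be illustrated with a minimal sketch. This is not the HeALER implementation: vote entropy is assumed here as the disagreement measure, and the function names, the three hand-written rules standing in for trained heterogeneous classifiers, and the similarity-feature tuples are all hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement among committee votes for one candidate pair (0 = unanimous)."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_most_informative(candidates, committee, batch_size=1):
    """Return the candidates with the highest committee disagreement."""
    scored = []
    for features in candidates:
        votes = [member(features) for member in committee]
        scored.append((vote_entropy(votes), features))
    # Highest disagreement first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [features for _, features in scored[:batch_size]]


# Hypothetical committee: simple rules standing in for trained classifiers.
# Each candidate is a (name_similarity, address_similarity) tuple (assumed features).
committee = [
    lambda f: f[0] > 0.5,
    lambda f: f[1] > 0.5,
    lambda f: (f[0] + f[1]) / 2 > 0.7,
]
candidates = [(0.9, 0.95), (0.1, 0.05), (0.8, 0.3)]
picked = select_most_informative(candidates, committee)
print(picked)
```

In this toy setup the first two pairs receive unanimous votes, so only the third pair, on which the rules split, would be sent to a human for labeling.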
Pages: 69 - 85
Number of pages: 17
Related papers
50 in total
  • [1] Committee-Based Active Learning for Speech Recognition
    Hamanaka, Yuzo
    Shinoda, Koichi
    Tsutaoka, Takuya
    Furui, Sadaoki
    Emori, Tadashi
    Koshinaka, Takafumi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (10) : 2015 - 2023
  • [2] SPEECH MODELING BASED ON COMMITTEE-BASED ACTIVE LEARNING
    Hamanaka, Yuzo
    Shinoda, Koichi
    Furui, Sadaoki
    Emori, Tadashi
    Koshinaka, Takafumi
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4350 - 4353
  • [3] Combining Committee-Based Semi-Supervised Learning and Active Learning
    Mohamed Farouk Abdel Hady
    Friedhelm Schwenker
    Journal of Computer Science & Technology, 2010, 25 (04) : 681 - 698
  • [4] Combining Committee-Based Semi-Supervised Learning and Active Learning
    Mohamed Farouk Abdel Hady
    Friedhelm Schwenker
    Journal of Computer Science and Technology, 2010, 25 : 681 - 698
  • [5] Combining Committee-Based Semi-Supervised Learning and Active Learning
    Hady, Mohamed Farouk Abdel
    Schwenker, Friedhelm
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (04) : 681 - 698
  • [6] Committee-Based Active Learning to Select Negative Examples for Predicting Protein Functions
    Frasca, Marco
    Sepehri, Maryam
    Petrini, Alessandro
    Grossi, Giuliano
    Valentini, Giorgio
    COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS, CIBB 2018, 2020, 11925 : 80 - 87
  • [7] Weighted Committee-Based Structure Learning for Microarray Data
    Njah, Hasna
    Jamoussi, Salma
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE), 2013,
  • [8] A new committee-based active learning (CBAL) approach to hyperspectral remote sensing data classification
    Xu, Jun
    Hang, Renlong
    REMOTE SENSING LETTERS, 2014, 5 (06) : 511 - 520
  • [9] AN ACTIVE LEARNING METHOD USING CLUSTERING AND COMMITTEE-BASED SAMPLE SELECTION FOR SOUND EVENT CLASSIFICATION
    Zhao Shuyang
    Heittola, Toni
    Virtanen, Tuomas
    2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 116 - 120
  • [10] Informativeness-Based Active Learning for Entity Resolution
    Christen, Victor
    Christen, Peter
    Rahm, Erhard
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 1168 : 125 - 141