Learning from crowds with decision trees

Cited by: 14
Authors
Yang, Wenjun [1 ]
Li, Chaoqun [1 ]
Jiang, Liangxiao [2 ]
Affiliations
[1] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Crowdsourcing learning; Weighted majority voting; Decision trees; MODEL QUALITY; STATISTICAL COMPARISONS; WEIGHTING FILTER; IMPROVING DATA; CLASSIFIERS; TOOL;
DOI
10.1007/s10115-022-01701-9
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Crowdsourcing systems provide an efficient way to collect labeled data by employing non-expert crowd workers. In practice, each instance receives a set of multiple noisy labels from different workers. Ground truth inference algorithms are designed to infer the unknown true labels of instances from these multiple noisy label sets. Since quality varies substantially among workers, evaluating the quality of each worker is crucial for ground truth inference. This paper proposes a novel algorithm called decision tree-based weighted majority voting (DTWMV). DTWMV directly takes the multiple noisy label set of each instance as its feature vector; that is, each worker is treated as a feature of the instances. Sequential decision trees are then built to calculate the weight of each feature (worker). Finally, weighted majority voting is used to infer the integrated labels of the instances. In DTWMV, evaluating the quality of workers is converted into calculating the weights of features, which provides a new perspective on the ground truth inference problem, and a novel feature weight measurement based on decision trees is proposed. Our experimental results show that DTWMV can effectively evaluate the quality of workers and improve the label quality of data.
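The core idea described above (workers as features, tree-derived feature weights, weighted majority voting) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the paper builds *sequential* decision trees, whereas this sketch bootstraps integrated labels with plain majority voting, fits a single scikit-learn `DecisionTreeClassifier` on them, and reuses its `feature_importances_` as worker weights. The function name `dtwmv_sketch` and the toy data are hypothetical.

```python
# Simplified sketch of DTWMV-style label integration, assuming scikit-learn.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def majority_vote(labels):
    """Plain (unweighted) majority vote over one instance's noisy label set."""
    return Counter(labels).most_common(1)[0][0]

def dtwmv_sketch(noisy_labels):
    """noisy_labels: (n_instances, n_workers) array of worker answers.

    1. Bootstrap integrated labels with unweighted majority voting.
    2. Fit a decision tree predicting those labels from the workers'
       answers, so each worker becomes one feature.
    3. Reuse the tree's feature importances as worker weights.
    4. Re-integrate labels with weighted majority voting.
    """
    mv = np.array([majority_vote(row) for row in noisy_labels])
    tree = DecisionTreeClassifier(random_state=0).fit(noisy_labels, mv)
    weights = tree.feature_importances_          # one weight per worker
    classes = np.unique(noisy_labels)
    # Score each class by the total weight of the workers voting for it.
    scores = np.array([[weights[row == c].sum() for c in classes]
                       for row in noisy_labels])
    return classes[scores.argmax(axis=1)], weights

# Toy example: three reliable workers copy the truth, two are unreliable.
truth = np.array([0, 1, 0, 1, 0, 1, 0, 1])
answers = np.column_stack([truth, truth, truth,
                           [1, 1, 0, 0, 1, 1, 0, 0],   # unreliable worker
                           [0, 0, 1, 1, 0, 0, 1, 1]])  # unreliable worker
pred, weights = dtwmv_sketch(answers)
```

On this toy set the tree concentrates its importance on a reliable worker, so the weighted vote recovers the true labels; with equal weights the method degenerates to plain majority voting.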
Pages: 2123-2140 (18 pages)