Learning from crowds with decision trees

Cited by: 14
Authors
Yang, Wenjun [1 ]
Li, Chaoqun [1 ]
Jiang, Liangxiao [2 ]
Affiliations
[1] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Crowdsourcing learning; Weighted majority voting; Decision trees; MODEL QUALITY; STATISTICAL COMPARISONS; WEIGHTING FILTER; IMPROVING DATA; CLASSIFIERS; TOOL;
DOI
10.1007/s10115-022-01701-9
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Crowdsourcing systems provide an efficient way to collect labeled data by employing non-expert crowd workers. In practice, each instance receives a set of multiple noisy labels from different workers. Ground truth inference algorithms are designed to infer the unknown true labels of instances from these multiple noisy label sets. Since quality varies substantially among workers, evaluating the quality of each worker is crucial for ground truth inference. This paper proposes a novel algorithm called decision tree-based weighted majority voting (DTWMV). DTWMV directly takes the multiple noisy label set of each instance as its feature vector; that is, each worker is treated as a feature of the instances. Sequential decision trees are then built to calculate the weight of each feature (worker). Finally, weighted majority voting is used to infer the integrated labels of the instances. In DTWMV, evaluating the quality of workers is converted into calculating the weights of features, which provides a new perspective on the ground truth inference problem, and a novel feature weight measurement based on decision trees is proposed. Our experimental results show that DTWMV can effectively evaluate the quality of workers and improve the label quality of data.
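The core idea described above (workers as features, tree-derived feature weights, weighted majority voting) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the paper builds *sequential* decision trees, whereas this sketch bootstraps integrated labels with plain majority voting, fits a single scikit-learn `DecisionTreeClassifier` on them, and reuses its `feature_importances_` as worker weights. The function name `dtwmv_sketch` and the toy data are hypothetical.

```python
# Simplified sketch of DTWMV-style label integration, assuming scikit-learn.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def majority_vote(labels):
    """Plain (unweighted) majority vote over one instance's noisy label set."""
    return Counter(labels).most_common(1)[0][0]

def dtwmv_sketch(noisy_labels):
    """noisy_labels: (n_instances, n_workers) array of worker answers.

    1. Bootstrap integrated labels with unweighted majority voting.
    2. Fit a decision tree predicting those labels from the workers'
       answers, so each worker becomes one feature.
    3. Reuse the tree's feature importances as worker weights.
    4. Re-integrate labels with weighted majority voting.
    """
    mv = np.array([majority_vote(row) for row in noisy_labels])
    tree = DecisionTreeClassifier(random_state=0).fit(noisy_labels, mv)
    weights = tree.feature_importances_          # one weight per worker
    classes = np.unique(noisy_labels)
    # Score each class by the total weight of the workers voting for it.
    scores = np.array([[weights[row == c].sum() for c in classes]
                       for row in noisy_labels])
    return classes[scores.argmax(axis=1)], weights

# Toy example: three reliable workers copy the truth, two are unreliable.
truth = np.array([0, 1, 0, 1, 0, 1, 0, 1])
answers = np.column_stack([truth, truth, truth,
                           [1, 1, 0, 0, 1, 1, 0, 0],   # unreliable worker
                           [0, 0, 1, 1, 0, 0, 1, 1]])  # unreliable worker
pred, weights = dtwmv_sketch(answers)
```

On this toy set the tree concentrates its importance on a reliable worker, so the weighted vote recovers the true labels; with equal weights the method degenerates to plain majority voting.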
Pages: 2123-2140 (18 pages)