Learning from crowds with decision trees

Cited by: 14
Authors
Yang, Wenjun [1]
Li, Chaoqun [1]
Jiang, Liangxiao [2]
Affiliations
[1] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Crowdsourcing learning; Weighted majority voting; Decision trees; MODEL QUALITY; STATISTICAL COMPARISONS; WEIGHTING FILTER; IMPROVING DATA; CLASSIFIERS; TOOL
DOI
10.1007/s10115-022-01701-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Crowdsourcing systems provide an efficient way to collect labeled data by employing non-expert crowd workers. In practice, each instance receives a set of multiple noisy labels from different workers. Ground truth inference algorithms are designed to infer the unknown true labels of the data from these multiple noisy label sets. Because worker quality varies substantially, evaluating the quality of each worker is crucial for ground truth inference. This paper proposes a novel algorithm called decision tree-based weighted majority voting (DTWMV). DTWMV directly takes the multiple noisy label set of each instance as its feature vector; that is, each worker is treated as a feature of the instances. Sequential decision trees are then built to calculate the weight of each feature (worker). Finally, weighted majority voting is used to infer the integrated label of each instance. In DTWMV, evaluating worker quality is thus converted into calculating feature weights, which provides a new perspective on the ground truth inference problem. To this end, a novel feature weight measurement based on decision trees is proposed. The experimental results show that DTWMV can effectively evaluate worker quality and improve the label quality of the data.
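The pipeline described in the abstract (workers as features, a weight per worker, then weighted majority voting) can be illustrated with a minimal sketch. The abstract does not specify the decision-tree-based weight measurement, so this sketch swaps in a simpler stand-in: the information gain of a single one-level split on each worker's labels, measured against plain majority-vote pseudo-labels. The function names (`majority_vote`, `information_gain`, `infer_labels`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def majority_vote(labels, weights=None):
    """Return the label with the largest (weighted) vote.

    Missing labels are given as None and skipped. Ties go to the smallest
    label. Assumes at least one worker labeled the instance.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    score = {}
    for label, weight in zip(labels, weights):
        if label is None:
            continue
        score[label] = score.get(label, 0.0) + weight
    return max(sorted(score), key=score.get)


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, target):
    """Entropy reduction in `target` after one split on `feature`.

    This is an illustrative stand-in for the paper's weight measurement,
    NOT the sequential decision-tree procedure itself.
    """
    pairs = [(f, t) for f, t in zip(feature, target) if f is not None]
    if not pairs:
        return 0.0
    groups = {}
    for f, t in pairs:
        groups.setdefault(f, []).append(t)
    remainder = sum(len(g) / len(pairs) * entropy(g) for g in groups.values())
    # Clamp to guard against tiny negative values from floating-point rounding.
    return max(0.0, entropy([t for _, t in pairs]) - remainder)


def infer_labels(label_matrix):
    """label_matrix[i][j] is worker j's label for instance i (None if missing)."""
    n_workers = len(label_matrix[0])
    # Step 1: unweighted majority vote gives provisional pseudo-labels.
    pseudo = [majority_vote(row) for row in label_matrix]
    # Step 2: treat each worker as a feature and score it against the pseudo-labels.
    weights = [
        information_gain([row[j] for row in label_matrix], pseudo)
        for j in range(n_workers)
    ]
    # Step 3: weighted majority voting yields the integrated labels.
    integrated = [majority_vote(row, weights) for row in label_matrix]
    return integrated, weights


if __name__ == "__main__":
    # Hypothetical toy data: 6 instances, 3 workers, binary labels.
    toy = [
        [1, 1, 0],
        [1, 1, 1],
        [0, 0, 1],
        [0, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
    ]
    labels, worker_weights = infer_labels(toy)
    print(labels, worker_weights)
```

Information gain is symmetric with respect to label flips, so this stand-in would also give a high weight to a worker who is consistently wrong; it is meant only to show how worker weights feed into the final vote.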
Pages: 2123-2140
Number of pages: 18
Related papers
50 in total
  • [1] Learning from crowds with decision trees
    Yang, Wenjun
    Li, Chaoqun
    Jiang, Liangxiao
    KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 : 2123 - 2140
  • [2] Learning from crowds with robust support vector machines
    Yang, Wenjun
    Li, Chaoqun
    Jiang, Liangxiao
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (03)
  • [3] Learning decision trees for the partial label ranking problem
    Alfaro, Juan C.
    Aledo, Juan A.
    Gamez, Jose A.
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (02) : 890 - 918
  • [4] Learning Decision Trees from Distributed Datasets
    Xie Hongxia
    Shi Liping
    Meng Fanrong
    Wang Chun
    DCABES 2008 PROCEEDINGS, VOLS I AND II, 2008, : 96 - +
  • [5] Learning from crowds with robust logistic regression
    Li, Wenbin
    Li, Chaoqun
    Jiang, Liangxiao
    INFORMATION SCIENCES, 2023, 639
  • [6] Ensemble Learning from Crowds
    Zhang, Jing
    Wu, Ming
    Sheng, Victor S.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (08) : 1506 - 1519
  • [7] Decision Trees Learning System
    Paliwoda, M
    INTELLIGENT INFORMATION SYSTEMS 2002, PROCEEDINGS, 2002, 17 : 77 - 90
  • [8] Agnostically Learning Decision Trees
    Gopalan, Parikshit
    Kalai, Adam Tauman
    Klivans, Adam R.
    STOC'08: PROCEEDINGS OF THE 2008 ACM INTERNATIONAL SYMPOSIUM ON THEORY OF COMPUTING, 2008, : 527 - +
  • [9] Learning fuzzy decision trees
    Apolloni, B
    Zamponi, G
    Zanaboni, AM
    NEURAL NETWORKS, 1998, 11 (05) : 885 - 895
  • [10] Classification with decision trees from a nonparametric predictive inference perspective
    Abellan, Joaquin
    Baker, Rebecca M.
    Coolen, Frank P. A.
    Crossman, Richard J.
    Masegosa, Andres R.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 71 : 789 - 802