Global multiclass classification and dataset construction via heterogeneous local experts

Cited by: 1
Authors
Ahn S. [1]
Özgür A. [1]
Pilanci M. [1]
Affiliation
[1] Department of Electrical Engineering, Stanford University, Stanford, CA 94305
Source
IEEE Journal on Selected Areas in Information Theory | 2020 / Vol. 1 / No. 3
Funding
National Science Foundation (USA);
Keywords
Crowdsourcing; Dataset construction; Ensemble learning; Federated learning; Heterogeneous data; Multiclass classification; Set cover problem;
DOI
10.1109/JSAIT.2020.3041804
Abstract
In the domains of dataset construction and crowdsourcing, a notable challenge is to aggregate labels from a heterogeneous set of labelers, each of whom is potentially an expert in some subset of tasks (and less reliable in others). To reduce the cost of hiring human labelers or training automated labeling systems, it is of interest to minimize the number of labelers while ensuring the reliability of the resulting dataset. We model this as the problem of performing K-class classification using the predictions of smaller classifiers, each trained on a subset of [K], and derive bounds on the number of classifiers needed to accurately infer the true class of an unlabeled sample under both adversarial and stochastic assumptions. By exploiting a connection to the classical set cover problem, we produce a near-optimal scheme for designing such configurations of classifiers, which recovers the well-known one-vs.-one classification approach as a special case. Experiments with the MNIST and CIFAR-10 datasets demonstrate the favorable accuracy (compared to a centralized classifier) of our aggregation scheme applied to classifiers trained on subsets of the data. These results suggest a new way to automatically label data or adapt an existing set of local classifiers to larger-scale multiclass problems. © 2020 IEEE Journal on Selected Areas in Information Theory. All rights reserved.
Pages: 870-883
Page count: 13
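
Illustrative sketch (not taken from the paper): the abstract describes inferring a sample's class from local classifiers that each know only a subset of [K], with one-vs.-one as a special case. The Python below is a minimal sketch under a simplifying assumption that is ours, not the authors': a local classifier always returns the true class when that class is in its subset, and may return any class from its subset otherwise. Under that assumption, the true class is recoverable by elimination whenever every pair of classes appears together in at least one subset. The function names `covers_all_pairs` and `aggregate` are hypothetical.

```python
from itertools import combinations


def covers_all_pairs(subsets, num_classes):
    """Return True if every pair of classes appears together in some subset."""
    covered = set()
    for s in subsets:
        covered.update(combinations(sorted(s), 2))
    return all(pair in covered for pair in combinations(range(num_classes), 2))


def aggregate(subsets, predictions):
    """Return the classes that are consistent with every local prediction.

    A class k stays a candidate only if each classifier whose subset
    contains k actually predicted k (assumption: classifiers are always
    correct on samples whose true class lies in their subset).
    """
    candidates = set().union(*subsets)
    for s, p in zip(subsets, predictions):
        # Every class in this subset other than the predicted one is ruled out.
        candidates -= set(s) - {p}
    return candidates


# One-vs.-one special case for K = 4: one classifier per pair of classes.
K = 4
ovo_subsets = [set(pair) for pair in combinations(range(K), 2)]
true_class = 2
# Classifiers that know the true class return it; the others answer
# arbitrarily (here: the smallest class in their subset).
ovo_predictions = [true_class if true_class in s else min(s) for s in ovo_subsets]
result = aggregate(ovo_subsets, ovo_predictions)

# A smaller configuration that still covers every pair of classes.
small_subsets = [{0, 1, 2}, {0, 3}, {1, 3}, {2, 3}]
small_predictions = [true_class if true_class in s else min(s) for s in small_subsets]
small_result = aggregate(small_subsets, small_predictions)
```

In this toy setting the second configuration uses four classifiers instead of six while still covering every class pair, which is the kind of saving the set-cover viewpoint in the abstract is concerned with; the paper's actual construction and bounds are not reproduced here.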