SemiBoost: Boosting for Semi-Supervised Learning

Cited by: 199
Authors: Mallapragada, Pavan Kumar [1]; Jin, Rong [1]; Jain, Anil K. [1]; Liu, Yi [1]
Affiliation: [1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48823 USA
Funding: US National Science Foundation
Keywords: Machine learning; semi-supervised learning; semi-supervised improvement; manifold assumption; cluster assumption; boosting
DOI: 10.1109/TPAMI.2008.235
CLC number: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from the existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) performance improvement of any supervised learning algorithm with a multitude of unlabeled data, 2) efficient computation by the iterative boosting algorithm, and 3) exploiting both the manifold and cluster assumptions in training classification models. An empirical study on 16 different data sets and text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to the state-of-the-art semi-supervised learning algorithms.
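The iterative scheme the abstract describes can be sketched, very loosely, as a toy in Python. Each round pseudo-labels the unlabeled points whose similarity-based confidence toward +1 or -1 is highest (in the spirit of the paper's p_i/q_i quantities, which combine pairwise similarity with the current ensemble output) and then fits a new weak learner on the labeled plus pseudo-labeled set. Everything below is an illustrative simplification, not the authors' implementation: the decision-stump base learner, the RBF similarity, the top-k selection, and the error clamp that bounds the boosting weight are all assumptions made for this sketch.

```python
import numpy as np

def rbf_similarity(X, sigma=1.0):
    # Pairwise RBF similarity S_ij = exp(-||x_i - x_j||^2 / sigma^2).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma ** 2)

def fit_stump(X, y, w):
    # Weighted decision stump: exhaustive search over (feature, threshold, sign).
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, t, s)
    return best, best_err

def stump_predict(stump, X):
    j, t, s = stump
    return s * np.where(X[:, j] <= t, 1, -1)

def semiboost_sketch(X_l, y_l, X_u, rounds=5, k=2, sigma=1.0):
    """SemiBoost-style loop (toy): pseudo-label the most confident
    unlabeled points each round, then add a boosted weak learner."""
    X = np.vstack([X_l, X_u])
    S = rbf_similarity(X, sigma)
    n_l = len(X_l)
    ensemble = []  # list of (alpha, stump)

    def H(Xq):
        if not ensemble:
            return np.zeros(len(Xq))
        return sum(a * stump_predict(st, Xq) for a, st in ensemble)

    pseudo_idx, pseudo_y = [], []
    for _ in range(rounds):
        # Confidence of each unlabeled point toward +1 (p) and -1 (q),
        # combining similarity to the labeled data (cluster/manifold term)
        # with the current ensemble output, echoing the paper's p_i, q_i.
        Hu = H(X_u)
        p = S[n_l:, :n_l] @ (y_l == 1).astype(float) * np.exp(-Hu)
        q = S[n_l:, :n_l] @ (y_l == -1).astype(float) * np.exp(Hu)
        conf = np.abs(p - q)
        conf[pseudo_idx] = -np.inf  # never reselect a point
        for i in np.argsort(conf)[-k:]:
            pseudo_idx.append(i)
            pseudo_y.append(1 if p[i] > q[i] else -1)
        # Train a weak learner on labeled + pseudo-labeled points.
        Xt = np.vstack([X_l, X_u[pseudo_idx]])
        yt = np.concatenate([y_l, pseudo_y])
        w = np.full(len(yt), 1.0 / len(yt))
        stump, err = fit_stump(Xt, yt, w)
        err = min(max(err, 0.1), 0.9)  # clamp: keeps alpha bounded (toy choice)
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((alpha, stump))
    return lambda Xq: np.sign(H(Xq))
```

With two labeled points and a handful of unlabeled ones clustered around them, the returned classifier recovers the two clusters; note how the base learner here is arbitrary, which mirrors the paper's claim that SemiBoost can wrap any supervised algorithm.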
Pages: 2000-2014 (15 pages)