SemiBoost: Boosting for Semi-Supervised Learning

Cited by: 199
Authors:
Mallapragada, Pavan Kumar [1 ]
Jin, Rong [1 ]
Jain, Anil K. [1 ]
Liu, Yi [1 ]
Affiliations:
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48823 USA
Funding:
U.S. National Science Foundation;
Keywords:
Machine learning; semi-supervised learning; semi-supervised improvement; manifold assumption; cluster assumption; boosting;
DOI:
10.1109/TPAMI.2008.235
Chinese Library Classification:
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes:
081104; 0812; 0835; 1405;
Abstract:
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from the existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) improving the performance of any supervised learning algorithm using a multitude of unlabeled data, 2) efficient computation via the iterative boosting algorithm, and 3) exploiting both the manifold and the cluster assumptions in training classification models. An empirical study on 16 different data sets and a text categorization task demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to that of state-of-the-art semi-supervised learning algorithms.
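The abstract describes SemiBoost as a boosting-style wrapper: at each iteration, unlabeled examples are pseudo-labeled with a confidence derived from pairwise similarity (reflecting the manifold and cluster assumptions), the most confident ones are added to the training set, and the supplied supervised base learner is retrained and combined into an ensemble. The sketch below illustrates only that wrapper idea and is not the paper's exact algorithm; the RBF similarity, the decision-tree base learner, the greedy selection of a top fraction of unlabeled points, and the accuracy-based combination weight are simplifying assumptions standing in for the quantities the paper derives from its objective function.

```python
# Minimal SemiBoost-style wrapper sketch (assumptions: binary labels in {-1, +1},
# RBF similarity, scikit-learn-compatible base learner, simplified confidence
# and combination weights; not the paper's exact algorithm).
import numpy as np
from sklearn.base import clone
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.tree import DecisionTreeClassifier


def semiboost_sketch(X_l, y_l, X_u, base=None, n_rounds=10,
                     frac_per_round=0.1, gamma=1.0):
    """Return a list of (weight, fitted model) pairs; y_l must be in {-1, +1}."""
    base = base if base is not None else DecisionTreeClassifier(max_depth=3)
    X_train = np.asarray(X_l, dtype=float)
    y_train = np.asarray(y_l, dtype=float)
    X_rest = np.asarray(X_u, dtype=float)
    ensemble = []
    for _ in range(n_rounds):
        if len(X_rest) == 0:
            break
        # Similarity-weighted vote of the current (pseudo-)labels on each
        # remaining unlabeled point; large |score| means a confident pseudo-label.
        S = rbf_kernel(X_rest, X_train, gamma=gamma)
        score = S @ y_train
        k = max(1, int(frac_per_round * len(X_rest)))
        pick = np.argsort(-np.abs(score))[:k]
        pseudo_y = np.where(score[pick] >= 0, 1.0, -1.0)
        # Grow the training pool with the most confident pseudo-labels and
        # refit the supplied supervised base learner on the enlarged set.
        X_train = np.vstack([X_train, X_rest[pick]])
        y_train = np.concatenate([y_train, pseudo_y])
        X_rest = np.delete(X_rest, pick, axis=0)
        model = clone(base).fit(X_train, y_train)
        # Training accuracy serves as a crude stand-in for the analytically
        # derived combination weight used by the paper.
        ensemble.append((max(model.score(X_train, y_train), 1e-3), model))
    return ensemble


def semiboost_predict(ensemble, X):
    """Weighted vote of the boosted base learners; returns labels in {-1, +1}."""
    votes = sum(w * m.predict(np.asarray(X, dtype=float)) for w, m in ensemble)
    return np.where(votes >= 0, 1, -1)
```

A typical call would pass a handful of labeled examples (with labels in {-1, +1}) as X_l, y_l and the remaining pool as X_u, mirroring the few-labels/many-unlabeled setting the abstract targets.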
Pages: 2000-2014
Number of pages: 15
Related papers (50 in total)
  • [1] Semi-Supervised Learning via Regularized Boosting Working on Multiple Semi-Supervised Assumptions
    Chen, Ke
    Wang, Shihai
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (01) : 129 - 143
  • [2] Boosting for multiclass semi-supervised learning
    Tanha, Jafar
    van Someren, Maarten
    Afsarmanesh, Hamideh
    PATTERN RECOGNITION LETTERS, 2014, 37 : 63 - 77
  • [3] MSSBoost: A new multiclass boosting to semi-supervised learning
    Tanha, Jafar
    NEUROCOMPUTING, 2018, 314 : 251 - 266
  • [4] Multiclass Semi-Supervised Boosting Using Similarity Learning
    Tanha, Jafar
    Saberian, Mohammad Javad
    van Someren, Maarten
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 1205 - 1210
  • [5] Similarity Learning for Semi-Supervised Multi-Class Boosting
    Wang, Q. Y.
    Yuen, P. C.
    Feng, G. C.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2164 - 2167
  • [6] Boosting semi-supervised learning with Contrastive Complementary Labeling
    Deng, Qinyi
    Guo, Yong
    Yang, Zhibang
    Pan, Haolin
    Chen, Jian
    NEURAL NETWORKS, 2024, 170 : 417 - 426
  • [7] A hybrid semi-supervised boosting to sentiment analysis
    Tanha, Jafar
    Mahmudyan, Solmaz
    Farahi, Ahmad
    INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2021, 12 (02): : 1769 - 1784
  • [8] Semi-supervised learning by disagreement
    Zhou, Zhi-Hua
    Li, Ming
    KNOWLEDGE AND INFORMATION SYSTEMS, 2010, 24 (03) : 415 - 439
  • [9] A survey on semi-supervised learning
    van Engelen, Jesper E.
    Hoos, Holger H.
    MACHINE LEARNING, 2020, 109 : 373 - 440