Efficient Cluster-Based Boosting for Semisupervised Classification

Cited by: 6
Authors
Soares, Rodrigo G. F. [1]
Chen, Huanhuan [2]
Yao, Xin [3]
Affiliations
[1] Univ Fed Rural Pernambuco, Dept Informat, BR-52171900 Recife, PE, Brazil
[2] Univ Sci & Technol China, Sch Comp Sci, UBRI, Hefei 230027, Anhui, Peoples R China
[3] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen Key Lab Computat Intelligence, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cluster-based regularization; ensemble learning; multiclass classification; semisupervised classification;
DOI
10.1109/TNNLS.2018.2809623
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semisupervised classification (SSC) uses both labeled and unlabeled data to classify unseen instances. Because unlabeled data are typically available in large quantities, SSC algorithms must be able to handle large-scale data sets. Recently, various ensemble algorithms have been introduced with improved generalization performance compared to single classifiers. However, existing ensemble methods cannot handle typical large-scale data sets. We propose efficient cluster-based boosting (ECB), a multiclass SSC algorithm with cluster-based regularization that avoids generating decision boundaries in high-density regions. A semisupervised selection procedure reduces time and space complexity by selecting only the most informative unlabeled instances for training each base learner. We provide evidence that ECB achieves good performance with small amounts of selected data and a relatively small number of base learners. Our experiments confirmed that ECB scales to large data sets while delivering generalization comparable to state-of-the-art methods.
Pages: 5667-5680
Page count: 14