Scalable Graph-Based Semi-Supervised Learning through Sparse Bayesian Model

Cited by: 50
Authors
Jiang, Bingbing [1 ]
Chen, Huanhuan [1 ]
Yuan, Bo [2 ]
Yao, Xin [2 ,3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Southern Univ Sci & Technol SUSTech, Shenzhen Key Lab Computat Intelligence, Dept Comp Sci & Engn, Shenzhen 518055, Guangdong, Peoples R China
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Funding
National Natural Science Foundation of China;
Keywords
Semi-supervised learning; graph-based methods; sparse Bayesian model; incremental learning; large-scale data sets; CLASSIFICATION; ROBUSTNESS;
DOI
10.1109/TKDE.2017.2749574
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semi-supervised learning (SSL) concerns the problem of how to improve classifiers' performance by making use of prior knowledge from unlabeled data. In recent years, many SSL methods have been developed to integrate unlabeled data into classifiers based on either the manifold or the cluster assumption. In particular, graph-based approaches, which follow the manifold assumption, have achieved promising performance in many real-world applications. However, most of them work well only on small-scale data sets and lack probabilistic outputs. In this paper, a scalable graph-based SSL framework with a sparse Bayesian model is proposed by defining a graph-based sparse prior. Based on the traditional Bayesian inference technique, a sparse Bayesian SSL algorithm (SBSL²) is obtained, which can remove irrelevant unlabeled samples and make probabilistic predictions for out-of-sample data. Moreover, in order to scale SBSL² to large-scale data sets, an incremental SBSL² (ISBSL²) is derived. The key idea of ISBSL² is to employ an incremental strategy that sequentially selects the unlabeled samples that contribute to the learning, instead of using all available unlabeled samples directly. ISBSL² has lower time and space complexities than previous SSL algorithms that use all unlabeled samples. Extensive experiments on various data sets verify that our algorithms achieve comparable classification effectiveness and efficiency with much better scalability. Finally, a generalization error bound is derived based on robustness analysis.
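The core idea the abstract describes, smoothing labels over a similarity graph under a graph-based Gaussian prior, can be illustrated with a generic sketch. Everything below (the RBF graph construction, the function names, and the hyperparameters `lam` and `gamma`) is an illustrative assumption: this is plain graph-regularized MAP estimation, not a reconstruction of the paper's SBSL² sparse Bayesian inference or its incremental ISBSL² variant.

```python
import numpy as np

def rbf_graph_laplacian(X, sigma=1.0):
    # Dense RBF affinity graph W and its unnormalized Laplacian L = D - W.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    W = np.exp(-d2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def graph_regularized_ssl(X, y_labeled, labeled_idx, lam=1.0, gamma=1.0):
    """MAP estimate of soft labels f under a graph-based Gaussian prior
    p(f) ~ exp(-gamma/2 * f' L f), fit to the labeled entries only.
    Generic graph-based SSL sketch; not the paper's algorithm."""
    n = X.shape[0]
    L = rbf_graph_laplacian(X)
    J = np.zeros((n, n))                 # indicator of labeled points
    J[labeled_idx, labeled_idx] = 1.0
    b = np.zeros(n)
    b[labeled_idx] = y_labeled
    # Posterior mode: (lam*J + gamma*L) f = lam*b; tiny ridge for stability.
    f = np.linalg.solve(lam * J + gamma * L + 1e-8 * np.eye(n), lam * b)
    return f
```

With two well-separated clusters and one labeled point each (labels +1 and -1), the graph prior propagates each label across its cluster, giving soft scores whose sign classifies the unlabeled points.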
Pages: 2758-2771
Page count: 14
Related Papers
50 records (duplicates removed from the results shown)
  • [1] Graph-based sparse bayesian broad learning system for semi-supervised learning
    Xu, Lili
    Chen, C. L. Philip
    Han, Ruizhi
    Information Sciences, 2022, 597 : 193 - 210
  • [2] Graph-based semi-supervised learning
    Zhang, Changshui
    Wang, Fei
    Artificial Life and Robotics, 2009, 14 (4) : 445 - 448
  • [3] Graph-based semi-supervised learning
    Subramanya, Amarnag
    Talukdar, Partha Pratim
    Synthesis Lectures on Artificial Intelligence and Machine Learning, 2014, 29 : 1 - 126
  • [4] Graph-Based Semi-Supervised Learning as a Generative Model
    He, Jingrui
    Carbonell, Jaime
    Liu, Yan
    20th International Joint Conference on Artificial Intelligence, 2007 : 2492 - 2497
  • [5] Joint sparse graph and flexible embedding for graph-based semi-supervised learning
    Dornaika, F.
    El Traboulsi, Y.
    Neural Networks, 2019, 114 : 91 - 95
  • [6] Mixture distribution modeling for scalable graph-based semi-supervised learning
    Li, Zhi
    Li, Chaozhuo
    Yang, Liqun
    Yu, Philip S.
    Li, Zhoujun
    Knowledge-Based Systems, 2020, 200
  • [7] Active Model Selection for Graph-Based Semi-Supervised Learning
    Zhao, Bin
    Wang, Fei
    Zhang, Changshui
    Song, Yangqiu
    2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008 : 1881 - 1884
  • [8] On Consistency of Graph-based Semi-supervised Learning
    Du, Chengan
    Zhao, Yunpeng
    Wang, Feng
    2019 39th IEEE International Conference on Distributed Computing Systems (ICDCS 2019), 2019 : 483 - 491