Scalable Graph-Based Semi-Supervised Learning through Sparse Bayesian Model

Cited by: 50
Authors
Jiang, Bingbing [1]
Chen, Huanhuan [1]
Yuan, Bo [2]
Yao, Xin [2,3]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Southern Univ Sci & Technol SUSTech, Shenzhen Key Lab Computat Intelligence, Dept Comp Sci & Engn, Shenzhen 518055, Guangdong, Peoples R China
[3] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Funding
National Natural Science Foundation of China;
Keywords
Semi-supervised learning; graph-based methods; sparse Bayesian model; incremental learning; large-scale data sets; CLASSIFICATION; ROBUSTNESS;
DOI
10.1109/TKDE.2017.2749574
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semi-supervised learning (SSL) concerns the problem of improving classifiers' performance by making use of prior knowledge from unlabeled data. In recent years, many SSL methods have been developed to integrate unlabeled data into classifiers based on either the manifold or the cluster assumption. In particular, graph-based approaches, which follow the manifold assumption, have achieved promising performance in many real-world applications. However, most of them work well only on small-scale data sets and lack probabilistic outputs. In this paper, a scalable graph-based SSL framework built on a sparse Bayesian model is proposed by defining a graph-based sparse prior. Based on the traditional Bayesian inference technique, a sparse Bayesian SSL algorithm (SBS²L) is obtained, which can remove irrelevant unlabeled samples and make probabilistic predictions for out-of-sample data. Moreover, in order to scale SBS²L to large-scale data sets, an incremental SBS²L (ISBS²L) is derived. The key idea of ISBS²L is to employ an incremental strategy that sequentially selects the unlabeled samples that contribute to learning, instead of using all available unlabeled samples directly. ISBS²L therefore has lower time and space complexities than previous SSL algorithms that use all unlabeled samples. Extensive experiments on various data sets verify that our algorithms achieve comparable classification effectiveness and efficiency with much better scalability. Finally, a generalization error bound is derived based on robustness analysis.
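As a rough illustration of the two ingredients the abstract describes (a graph-based sparse prior and the removal of irrelevant unlabeled samples), the Python sketch below combines an RBF basis over all samples with an ARD-style sparse prior whose precision is augmented by a k-NN graph Laplacian, then prunes basis samples whose weights are driven to zero. This is only a minimal sketch under those assumptions, not the authors' SBS²L/ISBS²L algorithms: the function names (rbf_kernel, knn_laplacian, sparse_graph_ssl, predict_proba), the logistic output, and all hyperparameter values are illustrative, and the incremental sample-selection strategy of ISBS²L is not reproduced here.

```python
# Illustrative sketch only: graph-regularised ARD pruning, not the paper's algorithm.
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) similarities between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)


def knn_laplacian(X, k=5, gamma=1.0):
    """Unnormalised Laplacian of a symmetrised k-NN similarity graph."""
    S = rbf_kernel(X, X, gamma)
    np.fill_diagonal(S, 0.0)
    idx = np.argsort(-S, axis=1)[:, :k]            # k nearest neighbours per row
    W = np.zeros_like(S)
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, idx.ravel()] = S[rows, idx.ravel()]
    W = np.maximum(W, W.T)                         # symmetrise the graph
    return np.diag(W.sum(axis=1)) - W


def sparse_graph_ssl(X, y, labeled, gamma=1.0, beta=25.0, mu=0.1,
                     n_iter=30, prune=1e6):
    """MAP weights under an ARD prior augmented with a graph-smoothness term.

    X       : (n, d) labeled and unlabeled samples
    y       : (n,) labels in {-1, +1}; entries for unlabeled rows are ignored
    labeled : (n,) boolean mask marking the labeled rows
    Returns the indices of the surviving basis samples and their weights.
    """
    Phi = rbf_kernel(X, X, gamma)                  # one basis function per sample
    L = knn_laplacian(X, gamma=gamma)
    alpha = np.ones(len(X))                        # ARD precision per weight
    keep = np.arange(len(X))                       # surviving basis samples
    w = np.zeros(len(X))
    for _ in range(n_iter):
        P = Phi[np.ix_(labeled, keep)]             # labeled rows, surviving columns
        B = Phi[:, keep]
        # prior precision = sparsity term (diag alpha) + graph smoothness (Laplacian)
        A = np.diag(alpha[keep]) + mu * B.T @ L @ B
        Sigma = np.linalg.inv(beta * P.T @ P + A)  # posterior covariance
        w = beta * Sigma @ P.T @ y[labeled]        # posterior mean
        g = 1.0 - alpha[keep] * np.diag(Sigma)     # how well each weight is determined
        alpha[keep] = np.clip(g, 1e-12, None) / np.clip(w ** 2, 1e-12, None)
        survivors = alpha[keep] < prune            # drop weights driven to zero
        keep, w = keep[survivors], w[survivors]
    return keep, w


def predict_proba(X_new, X, keep, w, gamma=1.0):
    """Out-of-sample scores squashed to (0, 1); a logistic link stands in for
    the full Bayesian predictive distribution described in the paper."""
    return 1.0 / (1.0 + np.exp(-rbf_kernel(X_new, X[keep], gamma) @ w))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 0.5, (60, 2)), rng.normal(2, 0.5, (60, 2))])
    y = np.r_[-np.ones(60), np.ones(60)]
    labeled = np.zeros(120, dtype=bool)
    labeled[[0, 1, 60, 61]] = True                 # only four labeled points
    keep, w = sparse_graph_ssl(X, y, labeled)
    pred = np.where(predict_proba(X, X, keep, w) > 0.5, 1.0, -1.0)
    print(f"kept {len(keep)} of 120 basis samples, accuracy {np.mean(pred == y):.2f}")
```

On the toy two-cluster data in the `__main__` block, typically only a handful of the 120 candidate basis samples survive the pruning, which mirrors, at a much smaller scale, the sparsity behaviour the abstract claims.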
Pages: 2758-2771
Number of pages: 14
Related papers
50 records in total
  • [31] A Sampling Theory Perspective of Graph-Based Semi-Supervised Learning
    Anis, Aamir
    El Gamal, Aly
    Avestimehr, A. Salman
    Ortega, Antonio
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (04) : 2322 - 2342
  • [32] Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning
    Wan, Sheng
    Pan, Shirui
    Yang, Jian
    Gong, Chen
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10049 - 10057
  • [33] Structured optimal graph based sparse feature extraction for semi-supervised learning
    Liu, Zhonghua
    Lai, Zhihui
    Ou, Weihua
    Zhang, Kaibing
    Zheng, Ruijuan
    SIGNAL PROCESSING, 2020, 170
  • [34] Graph-based semi-supervised learning via improving the quality of the graph dynamically
    Liang, Jiye
    Cui, Junbiao
    Wang, Jie
    Wei, Wei
    MACHINE LEARNING, 2021, 110 (06) : 1345 - 1388
  • [36] Graph-based semi-supervised relation extraction
    Chen, Jin-Xiu
    Ji, Dong-Hong
Ruan Jian Xue Bao/Journal of Software, 2008, 19 (11) : 2843 - 2852
  • [37] Privacy-Aware Distributed Graph-Based Semi-Supervised Learning
    Guler, Basak
    Avestimehr, A. Salman
    Ortega, Antonio
    2019 IEEE 29TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2019,
  • [38] Graph-based Semi-supervised Learning: Realizing Pointwise Smoothness Probabilistically
    Fang, Yuan
    Chang, Kevin Chen-Chuan
    Lauw, Hady W.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 406 - 414
  • [39] Self-reinforced diffusion for graph-based semi-supervised learning
    Li, Qilin
    Liu, Wanquan
    Li, Ling
    PATTERN RECOGNITION LETTERS, 2019, 125 : 439 - 445
  • [40] A general graph-based semi-supervised learning with novel class discovery
    Nie, Feiping
    Xiang, Shiming
    Liu, Yun
    Zhang, Changshui
    NEURAL COMPUTING & APPLICATIONS, 2010, 19 (04) : 549 - 555