Graph-Based Semi-Supervised Learning with Bipartite Graph for Large-Scale Data and Prediction of Unseen Data

Cited by: 0
Authors
Alemi, Mohammad [1]
Bosaghzadeh, Alireza [1]
Dornaika, Fadi [2, 3]
Affiliations
[1] Shahid Rajaee Teacher Training Univ, Dept Comp Engn, Tehran, Iran
[2] Univ Basque Country, Fac Comp Sci, San Sebastian 20018, Spain
[3] Basque Fdn Sci, Ikerbasque, Bilbao 48009, Spain
Keywords
large-scale data; graph construction; bipartite graph; label propagation; classification
DOI
10.3390/info15100591
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recently, considerable attention has been directed toward graph-based semi-supervised learning (GSSL) as an effective approach for data labeling. Despite the progress achieved by current methodologies, several limitations persist. Firstly, many studies treat all samples equally in terms of weight and influence, disregarding the potential increased importance of samples near decision boundaries. Secondly, the detection of outlier-labeled data is crucial, as it can significantly impact model performance. Thirdly, existing models often struggle with predicting labels for unseen test data, restricting their utility in practical applications. Lastly, most graph-based algorithms rely on affinity matrices that capture pairwise similarities across all data points, thus limiting their scalability to large-scale databases. In this paper, we propose a novel GSSL algorithm tailored for large-scale databases, leveraging anchor points to mitigate the challenges posed by large affinity matrices. Additionally, our method enhances the influence of nodes near decision boundaries by assigning different weights based on their importance and using a mapping function from feature space to label space. Leveraging this mapping function enables direct label prediction for test samples without requiring iterative learning processes. Experimental evaluations on two extensive datasets (Norb and Covtype) demonstrate that our approach is scalable and outperforms existing GSSL methods in terms of performance metrics.
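The anchor idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in the well-known anchor graph regularization formulation, in which each of the n samples is linked to only a few of m anchors (a bipartite n x m graph), so the full n x n affinity matrix is never formed. The sample weighting near decision boundaries and the feature-to-label mapping described in the abstract are not reproduced here. All function names, the random anchor selection, the parameter values (`k`, `sigma`, `gamma`), and the synthetic two-cluster data are assumptions made for illustration only.

```python
import numpy as np

def anchor_weights(X, anchors, k=3, sigma=1.0):
    """Return a row-stochastic n x m matrix Z linking each sample to its k nearest anchors."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(X.shape[0])[:, None]
    # Shift by the smallest distance per row so the Gaussian kernel cannot underflow to all zeros.
    shifted = d2[rows, nearest] - d2[rows, nearest].min(axis=1, keepdims=True)
    Z = np.zeros_like(d2)
    Z[rows, nearest] = np.exp(-shifted / (2.0 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)

def fit_anchor_labels(Z, labeled_idx, y_labeled, n_classes, gamma=0.01):
    """Solve for soft labels on the m anchors (anchor graph regularization style)."""
    m = Z.shape[1]
    Zl = Z[labeled_idx]
    Y = np.eye(n_classes)[y_labeled]           # one-hot labels of the labeled samples
    ZtZ = Z.T @ Z
    lam = np.maximum(Z.sum(axis=0), 1e-12)     # anchor degrees, guarded against zero
    # Reduced m x m Laplacian Z^T (I - Z diag(lam)^-1 Z^T) Z; no n x n matrix is built.
    L_red = ZtZ - ZtZ @ (ZtZ / lam[:, None])
    system = Zl.T @ Zl + gamma * L_red + 1e-8 * np.eye(m)  # small ridge for stability
    return np.linalg.solve(system, Zl.T @ Y)

def predict(X_new, anchors, A, k=3, sigma=1.0):
    """Label samples, seen or unseen, directly from the anchor labels without re-propagating."""
    return (anchor_weights(X_new, anchors, k, sigma) @ A).argmax(axis=1)

# Synthetic demo data (an assumption, unrelated to the datasets named in the abstract).
rng = np.random.default_rng(0)
centers = np.array([[-4.0, 0.0], [4.0, 0.0]])
X = np.vstack([rng.normal(c, 1.0, size=(200, 2)) for c in centers])
y = np.repeat([0, 1], 200)

# Anchors are a random subset here; clustering centroids are another common choice.
anchors = X[rng.choice(len(X), size=20, replace=False)]
Z = anchor_weights(X, anchors)

labeled_idx = np.r_[0:5, 200:205]              # five labeled samples per class
A = fit_anchor_labels(Z, labeled_idx, y[labeled_idx], n_classes=2)

unlabeled = np.setdiff1d(np.arange(len(X)), labeled_idx)
acc_unlabeled = (predict(X[unlabeled], anchors, A) == y[unlabeled]).mean()

# Unseen samples are handled by computing their anchor weights only.
X_unseen = np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in centers])
y_unseen = np.repeat([0, 1], 50)
acc_unseen = (predict(X_unseen, anchors, A) == y_unseen).mean()
```

In this sketch the only linear system solved is m x m, which is what makes anchor-based formulations attractive when n is large, and unseen points need nothing more than their weights to the existing anchors.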
Pages: 19