A Selection Metric for semi-supervised learning based on neighborhood construction

Cited by: 16
Authors
Emadi, Mona [1 ]
Tanha, Jafar [1 ]
Shiri, Mohammad Ebrahim [2 ,4 ]
Aghdam, Mehdi Hosseinzadeh [3 ]
Affiliations
[1] Univ Tabriz, Comp & Elect Engn Dept, Tabriz, Iran
[2] Islamic Azad Univ, Dept Comp Engn, Borujerd Branch, Borujerd, Iran
[3] Univ Bonab, Dept Comp Engn, Bonab, Iran
[4] Univ AmirKabir, Dept Comp Sci, Tehran, Iran
Keywords
Apollonius circle; Semi-supervised classification; Self-training; Support vector machine; Neighborhood construction
DOI
10.1016/j.ipm.2020.102444
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
The present paper focuses on semi-supervised classification problems. Semi-supervised learning is a learning task that uses both labeled and unlabeled samples. One of the main issues in semi-supervised learning is choosing a proper selection metric for sampling from the unlabeled data in order to extract informative unlabeled data points; this is vital for semi-supervised self-training algorithms. Most self-training algorithms employ the probability estimates of the underlying base learners to select high-confidence predictions, which are not always useful for improving the decision boundary. In this study, a novel self-training algorithm is proposed based on a new selection metric that uses a neighborhood construction algorithm. We select unlabeled data points that are close to the decision boundary. Although these points receive low confidence from the base learner's probability estimates, they are more effective for finding an optimal decision boundary. To assign correct labels to these data points, we propose an agreement criterion between the classifier predictions and the neighborhood construction algorithm. The proposed approach uses a neighborhood construction algorithm that employs peak data points and an Apollonius circle to sample from the unlabeled data. The algorithm then checks the agreement between the classifier predictions and the neighborhood construction algorithm to assign labels to unlabeled data at each iteration of the training process. The experimental results demonstrate that the proposed algorithm effectively improves the performance of the constructed classification model.
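The selection idea in the abstract can be sketched in code. The following is a minimal, hypothetical Python sketch, not the authors' implementation: it assumes a binary SVM base learner, uses distance to the decision boundary as the selection metric, and substitutes a simple k-NN label-agreement check for the paper's Apollonius-circle neighborhood construction (the Apollonius circle of two foci A and B with ratio k is the locus of points P with |PA|/|PB| = k). All function and parameter names here (self_train_boundary, band, k) are illustrative assumptions.

```python
# Hedged sketch (not the authors' code): a self-training loop that, instead of
# taking the base learner's highest-confidence predictions, selects unlabeled
# points CLOSE to the current decision boundary, in the spirit of the paper's
# selection metric. The Apollonius-circle neighborhood construction is replaced
# here by a plain k-NN agreement check purely for illustration.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def self_train_boundary(X_lab, y_lab, X_unlab, iters=10, band=0.5, k=5):
    X_lab, y_lab, X_unlab = map(np.asarray, (X_lab, y_lab, X_unlab))
    for _ in range(iters):
        if len(X_unlab) == 0:
            break
        clf = SVC(kernel="rbf").fit(X_lab, y_lab)
        # Selection metric: absolute distance to the SVM decision boundary
        # (binary case). Points inside the margin band are informative for
        # the boundary even though they are low-confidence.
        margins = np.abs(clf.decision_function(X_unlab))
        near = np.where(margins < band)[0]
        if len(near) == 0:
            break
        # Stand-in for the neighborhood construction step: accept a point
        # only if the classifier's prediction AGREES with the label implied
        # by its labeled neighborhood.
        knn = KNeighborsClassifier(n_neighbors=min(k, len(X_lab)))
        knn.fit(X_lab, y_lab)
        svm_pred = clf.predict(X_unlab[near])
        knn_pred = knn.predict(X_unlab[near])
        agree = near[svm_pred == knn_pred]
        if len(agree) == 0:
            break
        # Move the agreed-upon points into the labeled set and repeat.
        X_lab = np.vstack([X_lab, X_unlab[agree]])
        y_lab = np.concatenate([y_lab, clf.predict(X_unlab[agree])])
        X_unlab = np.delete(X_unlab, agree, axis=0)
    return SVC(kernel="rbf").fit(X_lab, y_lab)
```

In this sketch, widening band trades label noise against coverage: a larger band absorbs more boundary points per iteration but makes the agreement check more likely to admit mislabeled samples.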
Pages: 24