Semi-Supervised Self-Training Method Based on an Optimum-Path Forest

Cited by: 31
Authors
Li, Junnan [1]
Zhu, Qingsheng [1]
Affiliations
[1] Chongqing Univ, Dept Comp Sci, Chongqing 400044, Peoples R China
Keywords
Self-training method; semi-supervised classification; optimum-path forest; semi-supervised learning; CLASSIFICATION; NEIGHBOR; ALGORITHM; IMPROVE
DOI
10.1109/ACCESS.2019.2903839
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Semi-supervised self-training methods can train an effective classifier by exploiting both labeled and unlabeled samples. Recently, a self-training method based on density peaks of data (STDP) was proposed. However, it still has shortcomings. For example, STDP is sensitive to the cut-off distance d_c, so it is difficult to select an optimal parameter value for each data set. Furthermore, because of the cut-off distance d_c, STDP performs poorly on data sets with variations in density. To solve these problems, we present a new self-training method that connects unlabeled and labeled samples as vertices of an optimum-path forest to discover the underlying structure of the feature space. This structure is then used to guide the self-training method in training a classifier. Compared with STDP, our algorithm is free of parameters and works better on data sets with variations in density. Moreover, somewhat surprisingly, our algorithm also has some advantages in dealing with overlapping data sets. Experimental results on real data sets demonstrate that our algorithm outperforms some previous works in improving the performance of the base classifiers k-nearest neighbor, support vector machine, and CART.
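The abstract does not say how the forest is built, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes the common optimum-path forest formulation: labeled samples act as roots, the cost of a path is its largest edge (Euclidean distance here), and each unlabeled sample joins the tree that reaches it at the lowest cost. The function name `optimum_path_forest` and the toy data are hypothetical.

```python
import heapq
import math


def optimum_path_forest(points, labels):
    """Grow a forest rooted at the labeled samples (illustrative only).

    points: list of feature tuples.
    labels: class label for labeled samples, None for unlabeled ones.
    Returns (cost, pred, out_labels), where pred[i] is the parent of
    sample i in its tree and out_labels[i] is the label of its root.
    """
    n = len(points)
    cost = [math.inf] * n
    pred = [None] * n
    out_labels = list(labels)
    done = [False] * n
    heap = []

    # Assumption: every labeled sample is a root with path cost 0.
    for i, lab in enumerate(labels):
        if lab is not None:
            cost[i] = 0.0
            heapq.heappush(heap, (0.0, i))

    while heap:
        c, u = heapq.heappop(heap)
        if done[u]:
            continue  # stale heap entry
        done[u] = True
        for v in range(n):
            if done[v]:
                continue
            # Path cost = largest edge weight along the path to v.
            new_cost = max(c, math.dist(points[u], points[v]))
            if new_cost < cost[v]:
                cost[v] = new_cost
                pred[v] = u
                out_labels[v] = out_labels[u]
                heapq.heappush(heap, (new_cost, v))
    return cost, pred, out_labels


# Toy data: two groups on a line, one labeled sample in each.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
labels = ["a", None, None, None, None, "b"]
cost, pred, out_labels = optimum_path_forest(points, labels)
```

In a self-training loop, the parent links in `pred` could suggest an order in which unlabeled samples are labeled and added to the training set. How the paper actually uses the forest structure is not stated in this record.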
Pages: 36388-36399
Number of pages: 12
Related papers
39 in total
[1]   Help-Training for semi-supervised support vector machines [J].
Adankon, Mathias M. ;
Cheriet, Mohamed .
PATTERN RECOGNITION, 2011, 44 (09) :2220-2230
[2]   Multi-label semi-supervised classification through optimum-path forest [J].
Amorim, Willian P. ;
Falcao, Alexandre X. ;
Papa, Joao P. .
INFORMATION SCIENCES, 2018, 465 :86-104
[3]   Improving semi-supervised learning through optimum connectivity [J].
Amorim, Willian P. ;
Falcao, Alexandre X. ;
Papa, Joao P. ;
Carvalho, Marcelo H. .
PATTERN RECOGNITION, 2016, 60 :72-85
[4]   An improved optimum-path forest clustering algorithm for remote sensing image segmentation [J].
Chen, Siya ;
Sun, Tieli ;
Yang, Fengqin ;
Sun, Hongguang ;
Guan, Yu .
COMPUTERS & GEOSCIENCES, 2018, 112 :38-46
[5]   Natural neighbor-based clustering algorithm with local representatives [J].
Cheng, Dongdong ;
Zhu, Qingsheng ;
Huang, Jinlong ;
Yang, Lijun ;
Wu, Quanwang .
KNOWLEDGE-BASED SYSTEMS, 2017, 123 :238-253
[6]   Effective semi-supervised learning strategies for automatic sentence segmentation [J].
Dalva, Dogan ;
Guz, Umit ;
Gurkan, Hakan .
PATTERN RECOGNITION LETTERS, 2018, 105 :76-86
[7]   Self-training on refined clause patterns for relation extraction [J].
Duc-Thuan Vo ;
Bagheri, Ebrahim .
INFORMATION PROCESSING & MANAGEMENT, 2018, 54 (04) :686-706
[8]   Application of semi-supervised fuzzy c-means method in clustering multivariate geochemical data, a case study from the Dalli Cu-Au porphyry deposit in central Iran [J].
Fatehi, Moslem ;
Asadi, Hooshang H. .
ORE GEOLOGY REVIEWS, 2017, 81 :245-255
[9]   Safety-aware Graph-based Semi-Supervised Learning [J].
Gan, Haitao ;
Li, Zhenhua ;
Wu, Wei ;
Luo, Zhizeng ;
Huang, Rui .
EXPERT SYSTEMS WITH APPLICATIONS, 2018, 107 :243-254
[10]   Using clustering analysis to improve semi-supervised classification [J].
Gan, Haitao ;
Sang, Nong ;
Huang, Rui ;
Tong, Xiaojun ;
Dan, Zhiping .
NEUROCOMPUTING, 2013, 101 :290-298