Data preprocessing in semi-supervised SVM classification

Cited by: 27
Authors
Astorino, A. [2]
Gorgone, E. [1]
Gaudioso, M. [1]
Pallaschke, D. [3]
Affiliations
[1] Univ Calabria, Dipartimento Elettron Informat & Sistemist, I-87036 Arcavacata Di Rende, CS, Italy
[2] CNR, Ist Calcolo & Reti Ad Alte Prestaz, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Karlsruhe, Inst Operat Res, D-76128 Karlsruhe, Germany
Keywords
data classification; semi-supervised learning; SVM; nonsmooth optimization; optimization techniques
DOI
10.1080/02331931003692557
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Subject classification codes
070105; 12; 1201; 1202; 120202
Abstract
The literature on semi-supervised binary classification has demonstrated that useful information can be gathered not only from samples whose class membership is known in advance, but also from unlabelled ones. In semi-supervised support vector machine (SVM) models, both labelled and unlabelled samples contribute to the definition of an optimization model for finding a good-quality separating hyperplane. The optimization approaches devised in this context are basically of two types: a mixed integer linear programming problem, and a continuous optimization problem whose objective function is nonsmooth and nonconvex. Both problems become hard to solve as the number of unlabelled points increases. In this article, we present a data preprocessing technique that aims to reduce the number of unlabelled points entering the computational model without worsening the classification performance of the overall process too much. The approach is based on the concept of separating sets and can be implemented with reasonable computational effort. The results of numerical experiments on several benchmark datasets are also reported.
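The abstract does not state the continuous model explicitly. As a rough illustration only, the sketch below evaluates a commonly used textbook form of the semi-supervised SVM objective, which may differ from the formulation in the article. The function name `s3vm_objective` and the weights `c_lab` and `c_unl` are hypothetical names introduced here. The `max` and `abs` terms make the function nonsmooth, and the `1 - |score|` penalty on unlabelled points makes it nonconvex.

```python
def s3vm_objective(w, b, labelled, unlabelled, c_lab=1.0, c_unl=1.0):
    """Textbook S3VM-style objective (an assumed form, not taken from the article).

    w, b        -- hyperplane normal (list of floats) and bias
    labelled    -- list of (x, y) pairs with y in {-1, +1}
    unlabelled  -- list of points x without labels
    """
    def score(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    # Margin term: half the squared norm of w.
    margin_term = 0.5 * sum(wi * wi for wi in w)
    # Hinge loss on labelled points: convex, but nonsmooth where y * score == 1.
    lab_loss = sum(max(0.0, 1.0 - y * score(x)) for x, y in labelled)
    # Unlabelled points are penalised when they fall inside the margin,
    # whichever side they are on; this term is nonconvex.
    unl_loss = sum(max(0.0, 1.0 - abs(score(x))) for x in unlabelled)
    return margin_term + c_lab * lab_loss + c_unl * unl_loss


labelled = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]
inside_margin = s3vm_objective([1.0, 0.0], 0.0, labelled, [[0.5, 0.0]])
outside_margin = s3vm_objective([1.0, 0.0], 0.0, labelled, [[3.0, 0.0]])
```

Every unlabelled point adds one term to this sum, which is why a preprocessing step that discards unlabelled points before optimization reduces the size of the problem.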
Pages: 143-151
Page count: 9