Semi-supervised support vector machines for unlabeled data classification

Cited by: 103
Authors
Fung, G [1 ]
Mangasarian, OL [1 ]
Affiliation
[1] Univ Wisconsin, Dept Comp Sci, Madison, WI 53706 USA
Source
OPTIMIZATION METHODS & SOFTWARE | 2001 / Vol. 15 / No. 1
Keywords
unlabeled data; classification; support vector machines;
DOI
10.1080/10556780108805809
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
A concave minimization approach is proposed for classifying unlabeled data based on the following ideas: (i) A small representative percentage (5% to 10%) of the unlabeled data is chosen by a clustering algorithm and given to an expert or oracle to label. (ii) A linear support vector machine is trained using the small labeled sample while simultaneously assigning the remaining bulk of the unlabeled dataset to one of two classes so as to maximize the margin (distance) between the two bounding planes that determine the separating plane midway between them. This latter problem is formulated as a concave minimization problem on a polyhedral set for which a stationary point is quickly obtained by solving a few (5 to 7) linear programs. Such stationary points turn out to be very effective as evidenced by our computational results which show that clustered concave minimization yields: (a) Test set improvement as high as 20.4% over a linear support vector machine trained on a correspondingly small but randomly chosen subset that is labeled by an expert. (b) Test set correctness averaged to within 5.1% when compared to that of a completely supervised linear support vector machine trained on the entire dataset which has been labeled by an expert.
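Step (i) of the abstract, choosing a small representative subset for an expert to label, can be sketched as follows. This is only an illustration under stated assumptions: the abstract says "a clustering algorithm" without naming one, so plain k-means with a deterministic farthest-point start is swapped in here, and the names `kmeans` and `select_representatives` are hypothetical rather than taken from the paper. The concave minimization in step (ii) is not covered.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Plain k-means (assumed stand-in for the paper's clustering step)."""
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [tuple(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(tuple(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign


def select_representatives(points, fraction=0.1):
    """Return indices of the points nearest each cluster center.

    `fraction` mirrors the 5% to 10% range mentioned in the abstract;
    these indices are the ones that would be sent to the expert or oracle.
    """
    k = min(len(points), max(1, round(fraction * len(points))))
    centers, assign = kmeans(points, k)
    reps = set()
    for j, c in enumerate(centers):
        members = [i for i, a in enumerate(assign) if a == j]
        if members:
            reps.add(min(members, key=lambda i: dist2(points[i], c)))
    return sorted(reps)


# Usage: two well-separated groups of ten points each, 10% to be labeled.
data = [(0.1 * i, 0.0) for i in range(10)] + \
       [(10 + 0.1 * i, 10.0) for i in range(10)]
to_label = select_representatives(data, fraction=0.1)
print(to_label)  # indices of the points to hand to the expert
```

Picking the member nearest each center, rather than the center itself, ensures the expert is asked about actual data points. A random subset of the same size is the baseline against which the abstract reports its improvement.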
Pages: 29-44
Page count: 16