Constructing support vector machines with missing data

Cited by: 12
Authors
Stewart, Thomas G. [1]
Zeng, Donglin [2]
Wu, Michael C. [3]
Affiliations
[1] Vanderbilt Univ, Sch Med, Dept Biostat, Nashville, TN 37212 USA
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27515 USA
[3] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, 1124 Columbia St, Seattle, WA 98104 USA
DOI
10.1002/wics.1430
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Support vector machine (SVM) classification is a statistical learning method which easily accommodates large numbers of predictors and can discover both linear and nonlinear relationships between the predictors and outcomes. A common challenge is constructing an SVM when the training set includes observations with missing predictor values. In this paper, we identify when missing data can bias an SVM classifier. Because the missing data mechanisms which bias SVMs differ from the traditional framework of missing-at-random and missing-not-at-random, we argue for an SVM-specific framework for understanding missing data. Furthermore, we compare a number of missing data strategies for SVMs in a simulation study and real data example, and we make recommendations for SVM users based on the simulation study.
Pages: 16
Related papers
50 in total
  • [1] MULTICLASS SUPPORT VECTOR MACHINES FOR CLASSIFICATION OF ECG DATA WITH MISSING VALUES
    Hejazi, Maryamsadat
    Al-Haddad, S. A. R.
    Singh, Yashwant Prasad
    Hashim, Shaiful Jahari
    Aziz, Ahmad Fazli Abdul
    APPLIED ARTIFICIAL INTELLIGENCE, 2015, 29 (07) : 660 - 674
  • [2] Rough set methods for constructing Support Vector Machines
    Li, YC
    Fang, TJ
    ROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, 2003, 2639 : 334 - 338
  • [3] EXPECTED KERNEL FOR MISSING FEATURES IN SUPPORT VECTOR MACHINES
    Anderson, Hyrum S.
    Gupta, Maya R.
    2011 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2011 : 285 - 288
  • [4] Land-cover classification of partly missing data using support vector machines
    Salberg, Arnt-Borre
    Jenssen, Robert
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2012, 33 (14) : 4471 - 4481
  • [5] Data Augmentation for Support Vector Machines
    Hans, Chris
    BAYESIAN ANALYSIS, 2011, 6 (01) : 37 - 41
  • [6] Support vector machines for dyadic data
    Hochreiter, Sepp
    Obermayer, Klaus
    NEURAL COMPUTATION, 2006, 18 (06) : 1472 - 1510
  • [7] Simplify Support Vector Machines by Iterative Learning and Constructing Reduced Vector Set
    Zhang, Peng
    Liu, Litao
    2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL VII, 2010 : 71 - 75
  • [8] Simplify Support Vector Machines by Iterative Learning and Constructing Reduced Vector Set
    Zhang, Peng
    Liu, Litao
    2011 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION AND INDUSTRIAL APPLICATION (ICIA2011), VOL II, 2011 : 71 - 75
  • [9] Novel algorithm for constructing support vector machines classification ensemble
    Chen, Pu
    Zhang, Dayong
    Jiang, Zhenhuan
    Wu, Chong
    Journal of Computational Information Systems, 2011, 7 (13) : 4890 - 4897
  • [10] Maximal variation and missing values for componentwise support vector machines
    Pelckmans, K
    Suykens, JAK
    De Moor, B
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005 : 2814 - 2819