Data selection based on decision tree for SVM classification on large data sets

Cited by: 57
Authors
Cervantes, Jair [1 ]
Garcia Lamont, Farid [1 ]
Lopez-Chau, Asdrubal [2 ]
Rodriguez Mazahua, Lisbeth [3 ]
Sergio Ruiz, J. [1 ]
Affiliations
[1] CU UAEM Texcoco, Fracc El Tejocote, Texcoco, Mexico
[2] CU UAEM Zumpango, Zumpango 55600, Estado de Mexic, Mexico
[3] Inst Tecnol Orizaba, Div Res & Postgrad Studies, Orizaba 9432, Veracruz, Mexico
Keywords
SVM; Classification; Large data sets; SUPPORT VECTOR MACHINES; ALGORITHM; PROPERTY;
DOI
10.1016/j.asoc.2015.08.048
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Support Vector Machines (SVMs) have important properties, such as a strong mathematical foundation and better generalization capability than other classification methods. Their major drawback lies in the training phase, which is computationally expensive and highly dependent on the size of the input data set. This study presents a new algorithm that speeds up SVM training by selecting a small, representative subset of the data. The method uses an induction tree to reduce the training data set for the SVM, yielding a very fast, high-accuracy algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations while training faster. (C) 2015 Elsevier B.V. All rights reserved.
Pages: 787-798
Number of pages: 12
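The abstract does not say how the induction tree picks the reduced training set, so the following is only a minimal sketch of one possible reading, not the authors' algorithm. It grows a small Gini-based tree, discards samples that land in pure nodes (assumed to lie far from the class boundary), and keeps samples in nodes that stay mixed. The function names (`gini`, `best_split`, `select_boundary_points`), the "keep impure nodes" rule, and the toy grid data are all assumptions made for illustration; training an SVM on the reduced set is left out.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    n = len(y)
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring feature values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:  # require a real improvement
                best, best_score = (f, t), score
    return best

def select_boundary_points(X, y, idx=None, depth=0, max_depth=4):
    """Return indices of samples that end up in impure (mixed-class) nodes."""
    if idx is None:
        idx = list(range(len(y)))
    labels = [y[i] for i in idx]
    if len(set(labels)) <= 1:
        return []          # pure node: assumed far from the boundary, drop it
    if depth == max_depth:
        return idx         # still mixed at the depth limit: keep everything
    split = best_split([X[i] for i in idx], labels)
    if split is None:
        return idx         # no split helps: keep the mixed region as is
    f, t = split
    left = [i for i in idx if X[i][f] <= t]
    right = [i for i in idx if X[i][f] > t]
    return (select_boundary_points(X, y, left, depth + 1, max_depth)
            + select_boundary_points(X, y, right, depth + 1, max_depth))

# Toy data (hypothetical): a 30 x 10 integer grid with two pure blocks
# (x < 10 is class 0, x >= 20 is class 1) and a mixed band in between.
X = [(x, v) for x in range(30) for v in range(10)]
y = [0 if x < 10 else (1 if x >= 20 else (x + v) % 2) for x, v in X]

selected = select_boundary_points(X, y)
X_reduced = [X[i] for i in selected]
y_reduced = [y[i] for i in selected]
print(len(X), "samples reduced to", len(X_reduced))
```

On this toy grid the two pure blocks are dropped and only the mixed band is kept. The lists `X_reduced` and `y_reduced` would then be handed to any SVM trainer in place of the full data set. A rule this simple can discard too much when the classes are separable by a single axis-aligned split, so a practical variant would need an extra safeguard.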
Related papers
50 in total
  • [41] Bagging Decision Trees on Data Sets with Classification Noise
    Abellan, Joaquin
    Masegosa, Andres R.
    FOUNDATIONS OF INFORMATION AND KNOWLEDGE SYSTEMS, PROCEEDINGS, 2010, 5956 : 248 - 265
  • [42] A SVM based classification method for homogeneous data
    Li, Huan
    Chung, Fu-Lai
    Wang, Shitong
    APPLIED SOFT COMPUTING, 2015, 36 : 228 - 235
  • [43] Microarray data classification using automatic SVM kernel selection
    Nahar, Jesmin
    Ali, Shawkat
    Chen, Yi-Ping Phoebe
    DNA AND CELL BIOLOGY, 2007, 26 (10) : 707 - 712
  • [45] SVM classification for imbalanced data sets using a multiobjective optimization framework
    Askan, Aysegul
    Sayin, Serpil
    ANNALS OF OPERATIONS RESEARCH, 2014, 216 (01) : 191 - 203
  • [46] A classification method based on non-linear SVM decision tree
    Zhao, Hui
    Yao, Yong
    Liu, Zhijing
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2007, : 635 - 638
  • [47] The Application Based on Decision Tree SVM for Multi-class Classification
    Hou Huifang
    Han Ping
    Cao Dan
    PROCEEDINGS OF THE 2015 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER ENGINEERING AND ELECTRONICS (ICECEE 2015), 2015, 24 : 1656 - 1660
  • [48] Fuzzy Hoeffding Decision Tree for Data Stream Classification
    Ducange, Pietro
    Marcelloni, Francesco
    Pecori, Riccardo
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01) : 946 - 964
  • [49] Spatial Data Classification Using Decision Tree Models
    Gupta, Mahendra
    Minz, S.
    2017 CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY (CICT), 2017,
  • [50] A Fuzzy Decision Tree Approach for Imbalanced Data Classification
    Sardari, Sahar
    Eftekhari, Mahdi
    2016 6TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2016, : 292 - 297