Data selection based on decision tree for SVM classification on large data sets

Cited by: 57
Authors
Cervantes, Jair [1 ]
Garcia Lamont, Farid [1 ]
Lopez-Chau, Asdrubal [2 ]
Rodriguez Mazahua, Lisbeth [3 ]
Sergio Ruiz, J. [1 ]
Affiliations
[1] CU UAEM Texcoco, Fracc El Tejocote, Texcoco, Mexico
[2] CU UAEM Zumpango, Zumpango 55600, Estado de Mexic, Mexico
[3] Inst Tecnol Orizaba, Div Res & Postgrad Studies, Orizaba 9432, Veracruz, Mexico
Keywords
SVM; Classification; Large data sets; SUPPORT VECTOR MACHINES; ALGORITHM; PROPERTY;
DOI
10.1016/j.asoc.2015.08.048
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Support Vector Machines (SVMs) have important properties, such as a strong mathematical foundation and better generalization capability than other classification methods. Their major drawback lies in the training phase, which is computationally expensive and highly dependent on the size of the input data set. This study presents a new algorithm that speeds up SVM training by selecting a small, representative subset of the data. The method uses an induction tree to reduce the training data set for the SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations while training faster. (C) 2015 Elsevier B.V. All rights reserved.
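The abstract does not say how the induction tree picks the points to keep, so the following is only a minimal sketch of one plausible heuristic and should not be read as the authors' algorithm. It assumes a shallow, Gini-based tree, keeps every sample from leaves that still mix classes (likely near the class boundary), and keeps a single representative from each pure leaf. The names `gini`, `best_split`, `leaves`, `select_training_subset`, and the `max_depth` parameter are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(points, labels):
    """Return the (feature, threshold) pair that lowers weighted Gini the most.

    Returns None when no axis-aligned split improves on the unsplit impurity.
    """
    best, best_score = None, gini(labels)
    n = len(points)
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for p, y in zip(points, labels) if p[f] <= t]
            right = [y for p, y in zip(points, labels) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def leaves(points, labels, indices, depth, max_depth):
    """Yield one list of sample indices per leaf of a depth-limited tree."""
    sub_labels = [labels[i] for i in indices]
    split = None
    if depth < max_depth and len(set(sub_labels)) > 1:
        split = best_split([points[i] for i in indices], sub_labels)
    if split is None:
        yield indices
        return
    f, t = split
    left = [i for i in indices if points[i][f] <= t]
    right = [i for i in indices if points[i][f] > t]
    yield from leaves(points, labels, left, depth + 1, max_depth)
    yield from leaves(points, labels, right, depth + 1, max_depth)


def select_training_subset(points, labels, max_depth=2):
    """Assumed heuristic: keep mixed leaves whole, one sample per pure leaf."""
    keep = []
    for leaf in leaves(points, labels, list(range(len(points))), 0, max_depth):
        if len({labels[i] for i in leaf}) > 1:
            keep.extend(leaf)      # mixed leaf: classes overlap here, keep all
        else:
            keep.append(leaf[0])   # pure leaf: one representative sample
    return sorted(keep)


if __name__ == "__main__":
    # Toy 1-D data with class overlap around x = 3..4.
    pts = [(float(x),) for x in range(8)]
    labs = [0, 0, 0, 1, 0, 1, 1, 1]
    subset = select_training_subset(pts, labs, max_depth=1)
    print("kept indices:", subset)
```

The returned indices would then be used to build the reduced training set handed to whatever SVM trainer is in use. Because SVM training cost grows quickly with the number of samples, discarding points far from the class boundary is where the speed-up would come from under this assumption.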
Pages: 787-798
Number of pages: 12
Related papers
50 in total
  • [31] Imbalanced Data Sets Classification Based on SVM for Sand-Dust Storm Warning
    Xie, Yonghua
    Liu, Yurong
    Fu, Qingqiu
    DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2015, 2015
  • [32] Unsupervised feature selection for large data sets
    de Amorim, Renato Cordeiro
    PATTERN RECOGNITION LETTERS, 2019, 128 : 183 - 189
  • [33] Agricultural Data Classification Based on Rough Set and Decision Tree Ensemble
    Shi, Lei
    Ma, Xinming
    Duan, Qiguo
    Weng, Mei
    Qiao, Hongbo
    SENSOR LETTERS, 2012, 10 (1-2) : 271 - 278
  • [34] Preprocessing of Tandem Mass Spectrometric Data Based on Decision Tree Classification
    Jing-Fen Zhang
    Graduate School of Chinese Academy of Sciences
    Institute of Biochemistry and Cell Biology
    Genomics Proteomics & Bioinformatics, 2005, (04) : 231 - 237
  • [35] Study on Decision Tree Land Cover Classification Based on MODIS Data
    Wang Changyao
    Du Zitao
    Liu Zhengjun
    Liu Yonghong
    2008 INTERNATIONAL WORKSHOP ON EARTH OBSERVATION AND REMOTE SENSING APPLICATIONS, 2008, : 211 - +
  • [36] A network big data classification method based on decision tree algorithm
    Xiao N.
    Dai S.
    International Journal of Reasoning-based Intelligent Systems, 2024, 16 (01) : 66 - 73
  • [37] Random feature selection for decision tree classification of multi-temporal SAR data
    Waske, Bjoern
    Schiefer, Sebastian
    Braun, Matthias
    2006 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-8, 2006, : 168 - 171
  • [38] Decision Tree and SVM-Based Data Analytics for Theft Detection in Smart Grid
    Jindal, Anish
    Dua, Amit
    Kaur, Kuljeet
    Singh, Mukesh
    Kumar, Neeraj
    Mishra, S.
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2016, 12 (03) : 1005 - 1016
  • [39] A Geometric Approach to Train SVM on Very Large Data Sets
    Zeng, Zhi-Qiang
    Xu, Hua-Rong
    Xie, Yan-Qi
    Gao, Ji
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 991 - +
  • [40] Prediction of Healthy Blood with Data Mining Classification by using Decision Tree, Naive Bayesian and SVM approaches
    Khalilinezhad, Mandieh
    Minaei, Behrooz
    Vernazza, Gianni
    Dellepiane, Silvana
    SIXTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2014), 2015, 9443