Data selection based on decision tree for SVM classification on large data sets

Cited by: 57
Authors
Cervantes, Jair [1]
Garcia Lamont, Farid [1]
Lopez-Chau, Asdrubal [2]
Rodriguez Mazahua, Lisbeth [3]
Sergio Ruiz, J. [1]
Affiliations
[1] CU UAEM Texcoco, Fracc El Tejocote, Texcoco, Mexico
[2] CU UAEM Zumpango, Zumpango 55600, Estado de Mexic, Mexico
[3] Inst Tecnol Orizaba, Div Res & Postgrad Studies, Orizaba 9432, Veracruz, Mexico
Keywords
SVM; Classification; Large data sets; SUPPORT VECTOR MACHINES; ALGORITHM; PROPERTY;
DOI
10.1016/j.asoc.2015.08.048
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Support Vector Machine (SVM) has important properties such as a strong mathematical background and a better generalization capability with respect to other classification methods. On the other hand, the major drawback of SVM occurs in its training phase, which is computationally expensive and highly dependent on the size of input data set. In this study, a new algorithm to speed up the training time of SVM is presented; this method selects a small and representative amount of data from data sets to improve training time of SVM. The novel method uses an induction tree to reduce the training data set for SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm produces results with similar accuracy and in a faster way than the current SVM implementations. (C) 2015 Elsevier B.V. All rights reserved.
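The abstract says only that an induction tree is used to shrink the SVM training set; it does not give the procedure. The following is a minimal illustrative sketch of one way such a reduction could work, not the authors' algorithm. It assumes a simple heuristic: grow a shallow Gini-based tree, keep every point that falls in a class-mixed leaf (likely near the decision boundary), and keep only a small random sample from each pure leaf. All function names, the `max_depth` and `keep_pure` parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, idx):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini([y[i] for i in idx])
    for f in range(len(X[0])):
        values = sorted({X[i][f] for i in idx})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in idx if X[i][f] <= t]
            right = [y[i] for i in idx if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(idx)
            if score < best_score - 1e-12:  # accept only a strict improvement
                best, best_score = (f, t), score
    return best


def leaves(X, y, idx, depth, max_depth):
    """Partition the indices in idx into the leaves of a depth-limited tree."""
    split = best_split(X, y, idx) if depth < max_depth else None
    if split is None:
        return [idx]
    f, t = split
    left = [i for i in idx if X[i][f] <= t]
    right = [i for i in idx if X[i][f] > t]
    return (leaves(X, y, left, depth + 1, max_depth)
            + leaves(X, y, right, depth + 1, max_depth))


def select_for_svm(X, y, max_depth=2, keep_pure=0.1, seed=0):
    """Assumed heuristic: keep mixed leaves whole, subsample pure leaves."""
    rng = random.Random(seed)
    selected = []
    for leaf in leaves(X, y, list(range(len(X))), 0, max_depth):
        if len({y[i] for i in leaf}) > 1:
            selected.extend(leaf)  # mixed leaf: keep every point
        else:
            k = max(1, int(keep_pure * len(leaf)))
            selected.extend(rng.sample(leaf, k))  # pure leaf: small sample
    return sorted(selected)


# Toy 1-D data: two clean clusters with an overlapping zone in between.
X = [[float(v)] for v in list(range(40)) + list(range(45, 55)) + list(range(60, 100))]
y = [0] * 40 + [(v - 45) % 2 for v in range(45, 55)] + [1] * 40

selected = select_for_svm(X, y)
print(len(selected), "of", len(X), "points kept")
```

The reduced index list would then be handed to any SVM trainer (for example scikit-learn's `SVC`, omitted here to keep the sketch dependency-free). Whether accuracy is preserved depends on the data and on the tree depth, so the result should be checked against an SVM trained on the full set.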
Pages: 787-798
Number of pages: 12