Data selection based on decision tree for SVM classification on large data sets

Cited by: 59
Authors
Cervantes, Jair [1 ]
Garcia Lamont, Farid [1 ]
Lopez-Chau, Asdrubal [2 ]
Rodriguez Mazahua, Lisbeth [3 ]
Sergio Ruiz, J. [1 ]
Affiliations
[1] CU UAEM Texcoco, Fracc El Tejocote, Texcoco, Mexico
[2] CU UAEM Zumpango, Zumpango 55600, Estado de Mexic, Mexico
[3] Inst Tecnol Orizaba, Div Res & Postgrad Studies, Orizaba 9432, Veracruz, Mexico
Keywords
SVM; Classification; Large data sets; Support vector machines; Algorithm; Property
DOI
10.1016/j.asoc.2015.08.048
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Support Vector Machine (SVM) has important properties such as a strong mathematical background and a better generalization capability than other classification methods. On the other hand, the major drawback of SVM occurs in its training phase, which is computationally expensive and highly dependent on the size of the input data set. In this study, a new algorithm to speed up the training of SVM is presented; this method selects a small, representative subset of the data to reduce SVM training time. The novel method uses an induction tree to reduce the training data set for SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations in less time. © 2015 Elsevier B.V. All rights reserved.
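The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea of using a tree to shrink an SVM training set, not the authors' algorithm. It assumes scikit-learn and synthetic data, and the helper `select_with_tree` and its parameters (`max_depth`, `purity`, `keep_pure`) are hypothetical. The heuristic, also an assumption, is that leaves of a shallow decision tree that mix classes probably lie near the class boundary, so their points are kept in full, while nearly pure leaves are represented by a small random sample.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def select_with_tree(X, y, max_depth=4, purity=0.95, keep_pure=0.05, seed=0):
    """Return sorted indices of a reduced training set chosen with a shallow tree.

    Assumed heuristic (not taken from the paper): a leaf whose majority class
    covers less than `purity` of its points is treated as a boundary region and
    kept whole; any other leaf contributes a `keep_pure` fraction of its points.
    """
    rng = np.random.default_rng(seed)
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed).fit(X, y)
    leaf_of = tree.apply(X)  # leaf index reached by every training point
    keep = []
    for leaf in np.unique(leaf_of):
        idx = np.flatnonzero(leaf_of == leaf)
        _, counts = np.unique(y[idx], return_counts=True)
        if counts.max() / idx.size < purity:
            keep.append(idx)  # mixed leaf: keep all of its points
        else:
            n = max(1, int(keep_pure * idx.size))
            keep.append(rng.choice(idx, size=n, replace=False))
    return np.sort(np.concatenate(keep))


# Synthetic two-class problem standing in for a large data set.
X, y = make_classification(n_samples=4000, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

selected = select_with_tree(X_train, y_train)
svm_full = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
svm_reduced = SVC(kernel="rbf", gamma="scale").fit(X_train[selected],
                                                   y_train[selected])

print("training points:", len(X_train), "->", len(selected))
print("accuracy (full):   ", svm_full.score(X_test, y_test))
print("accuracy (reduced):", svm_reduced.score(X_test, y_test))
```

How much the training set shrinks, and how much accuracy is retained, depends on the data and on the assumed thresholds; the sketch only illustrates the trade-off the abstract describes.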
Pages: 787-798
Number of pages: 12