Training Set Size Reduction in Large Dataset Problems

Times cited: 0
Authors
Chouvatut, Varin [1]
Jindaluang, Wattana [1]
Boonchieng, Ekkarat [1]
Affiliations
[1] Chiang Mai Univ, Theoret & Empir Res Grp, Dept Comp Sci, Ctr Excellence Community Hlth Informat,Fac Sci, Chiang Mai, Thailand
Source
2015 INTERNATIONAL COMPUTER SCIENCE AND ENGINEERING CONFERENCE (ICSEC) | 2015
Keywords
Optimum-Path Forest; Training Set Size Reduction; Graph-based Classification Algorithm; Supervised Learning;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Classifiers are used in many fields of application. However, a main problem encountered recently is applying a classifier to large datasets. Reducing the size of the training set therefore becomes necessary, especially to accelerate the classifier's processing time. To address this problem, this paper proposes a new method that reduces the size of the training set in a large dataset. The proposed method improves on a well-known graph-based algorithm named Optimum-Path Forest (OPF). The principal idea for reducing the training set's size is to use the Segmented Least Squares Algorithm (SLSA) to estimate the tree's shape. In the experiments, the proposed method reduced the size of the training set by about 7 to 21 percent compared with the original OPF algorithm, while classification accuracy decreased only slightly, by about 0.2 to 0.5 percent. In addition, for some datasets, the method achieved the same accuracy as the original OPF algorithm.
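The abstract names the Segmented Least Squares Algorithm but does not say how it is applied to an OPF tree. As a rough illustration only, the sketch below shows the textbook dynamic-programming formulation of segmented least squares on a list of 2D points. It is not the authors' method: the function names, the `segment_cost` penalty parameter, and the treatment of degenerate segments are all assumptions made for this example.

```python
from math import inf
from typing import List, Tuple

Point = Tuple[float, float]


def fit_error(points: List[Point], i: int, j: int) -> float:
    """Sum of squared errors of the least-squares line through points[i..j]."""
    seg = points[i:j + 1]
    n = len(seg)
    sx = sum(x for x, _ in seg)
    sy = sum(y for _, y in seg)
    sxx = sum(x * x for x, _ in seg)
    sxy = sum(x * y for x, y in seg)
    denom = n * sxx - sx * sx
    if denom == 0:
        # Assumption: for a single point or identical x values,
        # fall back to the horizontal line through the mean of y.
        a, b = 0.0, sy / n
    else:
        a = (n * sxy - sx * sy) / denom
        b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in seg)


def segmented_least_squares(
    points: List[Point], segment_cost: float
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, segments) where each segment is an inclusive
    (start, end) index pair into the points sorted by x.

    Total cost = sum of per-segment fit errors + segment_cost per segment.
    """
    pts = sorted(points)
    n = len(pts)
    if n == 0:
        return 0.0, []

    # err[i][j]: fit error of a single line through pts[i..j].
    err = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j + 1):
            err[i][j] = fit_error(pts, i, j)

    # opt[j]: minimum cost for the first j points (1-based count).
    opt = [0.0] * (n + 1)
    start = [0] * (n + 1)
    for j in range(1, n + 1):
        best = inf
        for i in range(1, j + 1):
            cost = err[i - 1][j - 1] + segment_cost + opt[i - 1]
            if cost < best:
                best, start[j] = cost, i
        opt[j] = best

    # Walk back through the recorded choices to recover the segments.
    segments = []
    j = n
    while j > 0:
        i = start[j]
        segments.append((i - 1, j - 1))
        j = i - 1
    segments.reverse()
    return opt[n], segments
```

A larger `segment_cost` favors fewer, longer segments, while a smaller one lets the fit follow the data more closely. For example, calling `segmented_least_squares([(0, 0), (1, 1), (2, 2), (3, 10), (4, 11), (5, 12)], 1.0)` splits the points into two exactly fitting segments.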
Pages: 234-238
Page count: 5