A hybrid classification algorithm by subspace partitioning through semi-supervised decision tree

Cited by: 48
Authors
Kim, Kyoungok [1 ]
Affiliations
[1] Seoul Natl Univ Sci & Technol SeoulTech, Int Fus Sch, Informat Technol Management Programme, 232 Gongreungno, Seoul 139743, South Korea
Keywords
Decision tree; Semi-supervised decision tree; Inhomogeneous measure; Subspace partitioning; Regression
DOI
10.1016/j.patcog.2016.04.016
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Among data mining techniques, the decision tree is one of the most widely used methods for building classification models in practice because of its simplicity and ease of interpretation. However, the method has several drawbacks, including instability, a nonsmooth decision boundary, and a tendency to overfit. To overcome these problems, several works have combined decision trees with other classifiers, such as logistic regression, support vector machines, and neural networks, in hybrid models that exploit the relative strengths of each component while offsetting its weaknesses. Some hybrid models use the decision tree to partition the input space quickly and efficiently, and many studies have demonstrated the effectiveness of such hybrids. However, there is room for further improvement by considering the topological properties of a dataset, because typical decision trees split nodes based only on the target variable. The proposed semi-supervised decision tree splits internal nodes using both the labels and the structural characteristics of the data, so that the resulting subspace partitioning improves the accuracy of the classifiers applied at the terminal nodes of the hybrid model. Experimental results confirm the superiority of the proposed algorithm and illustrate its detailed characteristics.
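Illustrative sketch
The abstract does not give the exact form of the inhomogeneous measure, so the following Python sketch is only one plausible reading of the idea: candidate splits are scored by a weighted combination of label impurity (Gini, the supervised term) and a within-node variance term standing in for the structural, label-free term, and a logistic regression (one of the leaf classifiers the abstract names) is then fitted in each resulting subspace. The function names, the variance-based structure term, and the weight alpha are illustrative assumptions, not the paper's definitions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def gini(y):
    # Gini impurity of the labels in a node (the supervised term).
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def structure_term(X):
    # Hypothetical structural term: mean per-feature variance of the
    # points in a node, a stand-in for the paper's inhomogeneous measure.
    return float(np.mean(np.var(X, axis=0))) if len(X) > 1 else 0.0

def split_score(X, y, feature, threshold, alpha=0.7):
    # Weighted combination of label impurity and structure; lower is better.
    # Both the combination form and alpha are assumptions for illustration.
    left = X[:, feature] <= threshold
    right = ~left
    if not left.any() or not right.any():
        return np.inf
    n = len(X)
    label = (left.sum() * gini(y[left]) + right.sum() * gini(y[right])) / n
    struct = (left.sum() * structure_term(X[left])
              + right.sum() * structure_term(X[right])) / n
    return alpha * label + (1.0 - alpha) * struct

def best_split(X, y, alpha=0.7):
    # Exhaustive CART-style search over features and observed thresholds.
    best = (None, None, np.inf)
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f])[:-1]:
            s = split_score(X, y, f, t, alpha)
            if s < best[2]:
                best = (f, t, s)
    return best

# Usage: one split, then a logistic regression per subspace (hybrid step).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
f, t, _ = best_split(X, y)
mask = X[:, f] <= t
leaf_models = {side: LogisticRegression().fit(X[m], y[m])
               for side, m in (("left", mask), ("right", ~mask))}

With alpha = 1 the criterion reduces to an ordinary supervised CART-style split; lowering alpha lets the geometry of the points, labelled or not, influence where the subspaces are cut, which is the behaviour the abstract attributes to the semi-supervised tree.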
Pages: 157-163
Number of pages: 7