Complete Search for Feature Selection in Decision Trees

Cited by: 0
Authors
Ruggieri, Salvatore [1 ]
Affiliation
[1] Univ Pisa, Dept Comp Sci, Largo B Pontecorvo 3, I-56127 Pisa, Italy
Keywords
Feature Selection; Decision Trees; Wrapper Models; Complete Search; Classification
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
The search space for the feature selection problem in decision tree learning is the lattice of subsets of the available features. We design an exact enumeration procedure of the subsets of features that lead to all and only the distinct decision trees built by a greedy top-down decision tree induction algorithm. The procedure stores, in the worst case, a number of trees that is linear in the number of features. By exploiting a further pruning of the search space, we design a complete procedure for finding delta-acceptable feature subsets, whose estimated error departs by at most delta from the best estimated error over any feature subset; feature subsets attaining the best estimated error are called best feature subsets. Our results apply to any error estimator function, but experiments are mainly conducted under the wrapper model, in which the misclassification error over a search set is used as the estimator. The approach is also adapted into a computational optimization of the sequential backward elimination heuristic, extending its applicability to high-dimensional datasets. The procedures of this paper are implemented in a multi-core, data-parallel C++ system. We investigate experimentally the properties and limitations of the procedures on a collection of 20 benchmark datasets, showing that oversearching increases both overfitting and instability.
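To make the wrapper-model search concrete, the following is a minimal C++ sketch of the sequential backward elimination heuristic mentioned in the abstract. It is not the paper's implementation: estimateError is a hypothetical stand-in that, in the paper's setting, would train a greedy top-down decision tree on the given feature subset and return its misclassification error on a search set; the toy error values and all function names are assumptions for illustration only.

#include <iostream>
#include <set>
#include <limits>

// Hypothetical stand-in for the wrapper-model error estimator. In the
// paper's setting this would build a decision tree on the subset S and
// measure its misclassification error on a held-out search set.
double estimateError(const std::set<int>& S) {
    // Toy model: features 0 and 2 are informative; every other retained
    // feature adds a small amount of estimation noise.
    double err = 0.5;
    if (S.count(0)) err -= 0.20;
    if (S.count(2)) err -= 0.15;
    err += 0.01 * (S.size() - (S.count(0) + S.count(2)));
    return err;
}

// Sequential backward elimination: start from the full feature set and
// repeatedly drop the feature whose removal gives the lowest estimated
// error, as long as the error does not increase.
std::set<int> backwardElimination(int nFeatures) {
    std::set<int> current;
    for (int f = 0; f < nFeatures; ++f) current.insert(f);
    double bestErr = estimateError(current);

    bool improved = true;
    while (improved && current.size() > 1) {
        improved = false;
        int bestDrop = -1;
        double bestDropErr = std::numeric_limits<double>::infinity();
        for (int f : current) {
            std::set<int> candidate = current;
            candidate.erase(f);
            double e = estimateError(candidate);
            if (e < bestDropErr) { bestDropErr = e; bestDrop = f; }
        }
        if (bestDrop >= 0 && bestDropErr <= bestErr) {  // non-worsening removal
            current.erase(bestDrop);
            bestErr = bestDropErr;
            improved = true;
        }
    }
    return current;
}

int main() {
    std::set<int> selected = backwardElimination(6);
    std::cout << "selected features:";
    for (int f : selected) std::cout << ' ' << f;
    std::cout << "  estimated error: " << estimateError(selected) << '\n';
}

A delta-acceptable variant, in the abstract's sense, would retain every subset whose estimated error is within delta of the best error found; the sketch above follows only a single greedy removal path, which is what makes backward elimination a heuristic rather than a complete search.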
Pages: 34