Mining Optimal Decision Trees from Itemset Lattices

Cited by: 0
Authors
Nijssen, Siegfried [1]
Fromont, Elisa [1]
Affiliations
[1] Katholieke Univ Leuven, Leuven, Belgium
Source
KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2007
Keywords
Decision trees; Frequent itemsets; Formal concepts; Constraint-based mining;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal trees has high theoretical complexity, until now few efforts have been made to compute such trees for real-world datasets. An exact algorithm is of both scientific and practical interest. From a scientific point of view, it can be used as a gold standard to evaluate the performance of heuristic constraint-based decision tree learners and to gain new insight into traditional decision tree learners. From the application point of view, it can be used to discover trees that cannot be found by heuristic decision tree learners. The key idea behind our algorithm is that there is a relation between constraints on decision trees and constraints on itemsets. We show that optimal decision trees can be extracted from lattices of itemsets in linear time. We give several strategies to efficiently build these lattices. Experiments show that under the same constraints, DL8 obtains better results than C4.5, which confirms that exhaustive search does not always imply overfitting. The results also show that DL8 is a useful and interesting tool to learn decision trees under constraints.
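The relation between tree paths and itemsets mentioned in the abstract can be illustrated with a small sketch. This is not the DL8 implementation: it is a minimal memoized search that assumes binary features, minimizes only training error, and enforces only a depth constraint. Every name in it (`ROWS`, `cover`, `leaf_error`, `best_error`) and the toy data are hypothetical. Each path from the root is represented as an itemset of (feature, value) pairs, and results are cached per itemset so that paths reaching the same itemset in a different order are solved once.

```python
from functools import lru_cache

# Hypothetical toy data: (binary feature vector, class label), XOR-style.
ROWS = [
    ((0, 0), 0),
    ((0, 1), 1),
    ((1, 0), 1),
    ((1, 1), 0),
]
N_FEATURES = 2


def cover(itemset):
    """Labels of the rows that match every (feature, value) item."""
    return [y for x, y in ROWS if all(x[f] == v for f, v in itemset)]


def leaf_error(labels):
    """Training errors if a leaf predicts the majority class."""
    if not labels:
        return 0
    return len(labels) - max(labels.count(0), labels.count(1))


@lru_cache(maxsize=None)
def best_error(itemset, depth_left):
    """Minimum training error of any tree rooted at this itemset.

    The cache key is the itemset (a frozenset), so two paths that test
    the same items in a different order share one cached result.
    """
    best = leaf_error(cover(itemset))
    if depth_left == 0 or best == 0:
        return best
    used = {f for f, _ in itemset}
    for f in range(N_FEATURES):
        if f in used:
            continue
        left = itemset | {(f, 0)}
        right = itemset | {(f, 1)}
        # Skip splits that leave one branch without any rows.
        if not cover(left) or not cover(right):
            continue
        total = best_error(left, depth_left - 1) + best_error(right, depth_left - 1)
        best = min(best, total)
    return best


print(best_error(frozenset(), 1))
print(best_error(frozenset(), 2))
```

On this toy data no single split lowers the training error, while a depth-2 tree fits the rows exactly, which is the kind of case where an exhaustive search and a greedy split criterion can disagree.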
Pages: 530 - 539
Number of pages: 10
Related papers
50 in total
  • [41] DECISION TREES IN THE ANALYSIS OF THE INTENSITY OF DAMAGE TO PORTAL FRAME BUILDINGS IN MINING AREAS
    Firek, Karol
    Rusek, Janusz
    Wodynski, Aleksander
    ARCHIVES OF MINING SCIENCES, 2015, 60 (03) : 847 - 857
  • [42] Automated decision-making with DMN: from decision trees to decision tables
    Etinger, D.
    Simic, S. D.
    Buljubasic, L.
    2019 42ND INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2019, : 1309 - 1313
  • [43] Frequent itemset mining over time-sensitive streams
    Li, Hai-Feng
    Zhang, Ning
    Zhu, Jian-Ming
    Cao, Huai-Hu
    Jisuanji Xuebao/Chinese Journal of Computers, 2012, 35 (11) : 2283 - 2293
  • [44] Use of Frequent Itemset Mining Techniques to Analyze Business Processes
    Bartik, Vladimir
    Pospisil, Milan
    2015 7TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (IC3K), 2015, : 273 - 280
  • [45] A frequent itemset mining algorithm based on composite granular computing
    Wu, Hongjuan
    Liu, Yulu
    Yan, Pei
    Fang, Gang
    Zhong, Jing
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2018, 18 (01) : 247 - 257
  • [46] Learning Decision Trees from Distributed Datasets
    Xie Hongxia
    Shi Liping
    Meng Fanrong
    Wang Chun
    DCABES 2008 PROCEEDINGS, VOLS I AND II, 2008, : 96 - +
  • [47] HDFS Framework for Efficient Frequent Itemset Mining Using MapReduce
    Kulkarni, Prajakta G.
    Khonde, Shraddha R.
    2017 1ST INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND INFORMATION MANAGEMENT (ICISIM), 2017, : 171 - 178
  • [48] Lexicographic Logical Multi-Hashing For Frequent Itemset Mining
    Chaudhary, Shailza
    Sharma, Abhilasha
    Singh, Ravideep
    Kumar, Pardeep
    2015 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION & AUTOMATION (ICCCA), 2015, : 563 - 568
  • [49] Extraction of attribute importance from satisfaction surveys with data mining techniques: a comparison between neural networks and decision trees
    de Ona, Juan
    de Ona, Rocio
    Garrido, Concepcion
    TRANSPORTATION LETTERS-THE INTERNATIONAL JOURNAL OF TRANSPORTATION RESEARCH, 2017, 9 (01) : 39 - 48
  • [50] Inductive data mining: automatic generation of decision trees from data for QSAR modelling and process historical data analysis
    Ma, Chao Y.
    Buontempo, Frances V.
    Wang, Xue Z.
    INTERNATIONAL JOURNAL OF MODELLING IDENTIFICATION AND CONTROL, 2011, 12 (1-2) : 101 - 106