Graph Propositionalization for Random Forests

Cited by: 2
Authors
Karunaratne, Thashmee [1]
Bostrom, Henrik [1]
Affiliations
[1] Stockholm Univ, Dept Comp & Syst Sci, SE-16440 Kista, Sweden
Source
EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS | 2009
Keywords
graph propositionalization; random forests; structured data;
DOI
10.1109/ICMLA.2009.113
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph propositionalization methods transform structured and relational data into fixed-length feature vectors that can be used by standard machine learning methods. However, the choice of propositionalization method may have a significant impact on the performance of the resulting classifier. Six different propositionalization methods are evaluated when used in conjunction with random forests. The empirical evaluation shows that the choice of propositionalization method has a significant impact on the resulting accuracy for structured data sets. The results furthermore show that the maximum frequent itemset approach and a combination of this approach and maximal common substructures turn out to be the most successful propositionalization methods for structured data, each significantly outperforming the four other considered methods.
Pages: 196-201
Page count: 6
Related papers
21 in total
  • [1] Agarwal R., 1994, VLDB, V487, P499, DOI 10.5555/645920.672836
  • [2] Bournaud I, 2003, LECT NOTES ARTIF INT, V2583, P1
  • [3] Breiman, L. Random forests [J]. MACHINE LEARNING, 2001, 45 (01): 5-32
  • [4] Breiman L., 1984, WADSWORTH
  • [5] BURDICK D, 2001, P 17 INT C DAT ENG H
  • [6] Caruana R., 2005, International Conferences on Machine Learning, P161, DOI 10.1145/1143844.1143865
  • [7] Demsar J, 2006, J MACH LEARN RES, V7, P1
  • [8] Frank E, 2005, DATA MINING AND KNOWLEDGE DISCOVERY HANDBOOK, P1305, DOI 10.1007/0-387-25465-X_62
  • [9] Goethals B., 2003, FIMI 03, V90
  • [10] IAN H, 2005, WITTEN EIBE FRANK DA