Learning with Similarity Functions on Graphs using Matchings of Geometric Embeddings

Cited by: 22
Authors
Johansson, Fredrik D. [1 ]
Dubhashi, Devdatt [1 ]
Affiliation
[1] Chalmers Univ Technol, Gothenburg, Sweden
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Keywords
Matchings; Similarity functions; Graphs; Geometric embeddings; Classification; Kernels
DOI
10.1145/2783258.2783341
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We develop and apply the Balcan-Blum-Srebro (BBS) theory of classification via similarity functions (which are not necessarily kernels) to the problem of graph classification. First, we place the BBS theory into the unifying framework of optimal transport theory. This also opens the way to exploiting coupling methods for establishing the properties required of a good similarity function as per their definition. Next, we apply the approach to the problem of graph classification via geometric embeddings such as the Laplacian, the pseudoinverse Laplacian, and the Lovász orthogonal labellings. We consider the similarity function given by optimal and near-optimal matchings with respect to the Euclidean distance between the corresponding embeddings of the graphs in high dimensions. We use optimal couplings to rigorously establish that this yields a "good" similarity measure in the BBS sense for two well-known families of graphs. Further, we show that the similarity yields better classification accuracy in practice, on these families, than matchings of other well-known graph embeddings. Finally, we perform an extensive empirical evaluation on benchmark data sets, where we show that classifying graphs using matchings of geometric embeddings outperforms the previous state-of-the-art methods.
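As an illustration only, the matching-based similarity described in the abstract can be sketched as follows. This is not the authors' implementation: it assumes the vertex embeddings are already computed and passed in as lists of coordinate tuples, that both graphs have the same number of vertices, and that the matching cost is turned into a similarity by the hypothetical transform 1 / (1 + cost). The function names `matching_cost` and `matching_similarity` are made up for this sketch, and the brute-force search over permutations is only feasible for very small graphs.

```python
from itertools import permutations
from math import dist


def matching_cost(X, Y):
    """Minimum total Euclidean distance over all one-to-one matchings
    between the embedded vertices in X and those in Y.

    Assumption of this sketch: X and Y contain equally many vectors.
    """
    if len(X) != len(Y):
        raise ValueError("this sketch assumes equally many vertices")
    n = len(X)
    # Brute force: try every assignment of X's vertices to Y's vertices.
    return min(
        sum(dist(X[i], Y[p[i]]) for i in range(n))
        for p in permutations(range(n))
    )


def matching_similarity(X, Y):
    """Hypothetical similarity: decreases as the optimal matching cost grows."""
    return 1.0 / (1.0 + matching_cost(X, Y))


# Toy embeddings of three two-vertex graphs (illustrative values only).
X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(1.0, 0.0), (0.0, 0.0)]  # same points as X, listed in another order
Z = [(0.0, 3.0), (1.0, 4.0)]

print(matching_similarity(X, Y))
print(matching_similarity(X, Z))
```

Because the cost is minimised over all vertex correspondences, it does not depend on the order in which vertices are listed, which is why `X` and `Y` above come out as maximally similar. For larger graphs, a polynomial-time assignment solver would replace the permutation search.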
Pages: 467 - 476
Number of pages: 10
Related papers
45 in total
  • [1] Agarwal Pankaj K, 2014, STOC 14
  • [2] [Anonymous], 2009, PMLR
  • [3] [Anonymous], 2003, ACM SIGKDD Explorations Newslett.
  • [4] [Anonymous], 2009, Neural Information Processing Systems (NeurIPS)
  • [5] [Anonymous], 2001, J. Am. Stat. Assoc.
  • [6] [Anonymous], 2005, 5 IEEE INT C DATA MI, DOI 10.1109/ICDM.2005.132
  • [7] [Anonymous], 2008, CoRR
  • [8] A theory of learning with similarity functions
    Balcan, Maria-Florina
    Blum, Avrim
    Srebro, Nathan
    [J]. MACHINE LEARNING, 2008, 72 (1-2) : 89 - 112
  • [9] Bellet A., 2012, P 29 INT C MACH LEAR
  • [10] Protein function prediction via graph kernels
    Borgwardt, KM
    Ong, CS
    Schönauer, S
    Vishwanathan, SVN
    Smola, AJ
    Kriegel, HP
    [J]. BIOINFORMATICS, 2005, 21 : I47 - I56