Top-k Similar Graph Matching Using TraM in Biological Networks

Cited by: 14
Authors
Amin, Mohammad Shafkat [1]
Finley, Russell L., Jr. [2]
Jamil, Hasan M. [3]
Affiliations
[1] Wayne State Univ, Dept Comp Sci, Sunnyvale, CA 94086 USA
[2] Wayne State Univ, Ctr Mol Med & Genet, Detroit, MI 48201 USA
[3] Univ Idaho, Dept Comp Sci, Moscow, ID 83844 USA
Funding
National Science Foundation (USA);
Keywords
Graphs and networks; knowledge and data engineering tools and techniques; bioinformatics; graph and tree search strategies; biology and genetics; PROTEIN-INTERACTION NETWORKS; INTERACTOME NETWORK; DISEASE; COMPLEXES; TOOL; PRIORITIZATION; PREDICTION; ALGORITHM; PATHWAYS; PATTERNS;
DOI
10.1109/TCBB.2012.90
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Many emerging database applications entail sophisticated graph-based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact subgraph isomorphism techniques has limited their efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. In this paper, we propose a novel technique called TraM for approximate graph matching that off-loads a significant amount of its processing onto the database, making the approach viable for large graphs. Moreover, the vector space embedding of the graphs and efficient filtration of the search space enable computation of approximate graph similarity at a throw-away cost. We annotate nodes of the query graphs by means of their global topological properties and compare them with neighborhood-biased segments of the data graph for proper matches. We have conducted experiments on several real data sets and have demonstrated the effectiveness and efficiency of the proposed method.
Pages: 1790-1804
Page count: 15