Identifying similar-bicliques in bipartite graphs

被引:2
作者
Yao, Kai [1 ]
Chang, Lijun [1 ]
Yu, Jeffrey Xu [2 ]
机构
[1] Univ Sydney, Sydney, Australia
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
关键词
Maximal similar-biclique enumeration; Maximal biclique enumeration; Branch-reduce-and-bound; Structural similarity; Large bipartite graph; DATA-MANAGEMENT; DATABASE; LOCKING; INDEX;
D O I
10.1007/s00778-023-00834-9
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Bipartite graphs have been widely used to model the relationship between entities of different types, where vertices are partitioned into two disjoint sets/sides. Finding dense subgraphs in a bipartite graph is of great significance and encompasses many applications. However, none of the existing dense bipartite subgraph models consider similarity between vertices from the same side, and as a result, the identified results may include vertices that are not similar to each other. In this work, we formulate the notion of similar-biclique which is a special kind of biclique where all vertices from a designated side are similar to each other and aim to enumerate all similar-bicliques. The naive approach of first enumerating all maximal bicliques and then extracting all maximal similar-bicliques from them is inefficient, as enumerating maximal bicliques is already time consuming. We propose a backtracking algorithm MSBE to directly enumerate maximal similar-bicliques and power it by vertex reduction and optimization techniques. In addition, we design a novel index structure to speed up a time-critical operation of MSBE, as well as to speed up vertex reduction. Efficient index construction algorithms are developed. To handle dynamic graph updates, we also propose algorithms and optimization techniques for maintaining our index. Finally, we parallelize our index construction algorithms to exploit multiple CPU cores. Extensive experiments on 17 bipartite graphs as well as case studies are conducted to demonstrate the effectiveness and efficiency of our model and algorithms.
引用
收藏
页码:703 / 726
页数:24
相关论文
共 59 条
  • [41] THE MAXIMUM COVERAGE LOCATION PROBLEM
    MEGIDDO, N
    ZEMEL, E
    HAKIMI, SL
    [J]. SIAM JOURNAL ON ALGEBRAIC AND DISCRETE METHODS, 1983, 4 (02): : 253 - 261
  • [42] The maximum edge biclique problem is NP-complete
    Peeters, R
    [J]. DISCRETE APPLIED MATHEMATICS, 2003, 131 (03) : 651 - 654
  • [43] Salton Gerard, 1989, Automatic text processing: The transformation, analysis, and retrieval of, P169
  • [44] Obtaining maximal concatenated phylogenetic data sets from large sequence databases
    Sanderson, MJ
    Driskell, AC
    Ree, RH
    Eulenstein, O
    Langley, S
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2003, 20 (07) : 1036 - 1042
  • [45] Peeling Bipartite Networks for Dense Subgraph Discovery
    Sariyuce, Ahmet Erdem
    Pinar, Ali
    [J]. WSDM'18: PROCEEDINGS OF THE ELEVENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2018, : 504 - 512
  • [46] Satuluri V., 2011, P 2011 ACM SIGMOD IN, P721, DOI DOI 10.1145/1989323.1989399
  • [47] The worst-case time complexity for generating all maximal cliques and computational experiments
    Tomita, Etsuji
    Tanaka, Akira
    Takahashi, Haruhisa
    [J]. THEORETICAL COMPUTER SCIENCE, 2006, 363 (01) : 28 - 42
  • [48] Parallel Index-Based Structural Graph Clustering and Its Approximation
    Tseng, Tom
    Dhulipala, Laxman
    Shun, Julian
    [J]. SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2021, : 1851 - 1864
  • [49] Uno Takeaki., 2004, FIMI, V126
  • [50] Wang J., 2006, Proceedings of the Twenty-Ninth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, P501, DOI 10.1145/1148170.1148257