A MORE RELAXED MODEL FOR GRAPH-BASED DATA CLUSTERING: s-PLEX CLUSTER EDITING

被引:29
作者
Guo, Jiong [1 ]
Komusiewicz, Christian [2 ]
Niedermeier, Rolf [2 ]
Uhlmann, Johannes [2 ]
机构
[1] Univ Saarland, D-66123 Saarbrucken, Germany
[2] Univ Jena, Inst Informat, D-07743 Jena, Germany
关键词
NP-hard problems; exact algorithms; fixed-parameter tractability; data reduction; graph modification; k-plex; dense subgraphs; forbidden subgraph characterization; ALGORITHMS;
D O I
10.1137/090767285
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
We introduce the s-Plex Cluster Editing problem as a generalization of the well-studied Cluster Editing problem; both are NP-hard and both are motivated by graph-based data clustering. Instead of transforming a given graph by a minimum number of edge modifications into a disjoint union of cliques (this is Cluster Editing), the task in the case of s-Plex Cluster Editing is to transform a graph into a cluster graph consisting of a disjoint union of so-called s-plexes. Herein, an s-plex is a vertex set S inducing a subgraph in which every vertex has degree at least vertical bar S vertical bar - s. Cliques are 1-plexes. The advantage of s-plexes for s >= 2 is that they allow us to model a more relaxed cluster notion (s-plexes instead of cliques), better reflecting inaccuracies of the input data. We develop a provably effective preprocessing based on data reduction (yielding a so-called problem kernel), a forbidden subgraph characterization of s-plex cluster graphs, and a depth-bounded search tree which is used to find optimal edge modification sets. Altogether, this yields efficient algorithms in case of moderate numbers of edge modifications; this is often a reasonable assumption under a maximum parsimony model for data clustering.
引用
收藏
页码:1662 / 1683
页数:22
相关论文
共 37 条
  • [1] BALASUNDARAM B, 2009, OPER RES IN PRESS
  • [2] Correlation clustering
    Bansal, N
    Blum, A
    Chawla, S
    [J]. MACHINE LEARNING, 2004, 56 (1-3) : 89 - 113
  • [3] Bocker S., 2009, ALGORITHMIC IN PRESS
  • [4] Bodlaender HL, 2009, LECT NOTES COMPUT SC, V5917, P17, DOI 10.1007/978-3-642-11269-0_2
  • [5] Going weighted: Parameterized algorithms for cluster editing
    Boecker, S.
    Briesemeister, S.
    Bui, Q. B. A.
    Truss, A.
    [J]. THEORETICAL COMPUTER SCIENCE, 2009, 410 (52) : 5467 - 5480
  • [6] Chen JA, 2010, LECT NOTES COMPUT SC, V6196, P459, DOI 10.1007/978-3-642-14031-0_49
  • [7] Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function
    Chesler, EJ
    Lu, L
    Shou, SM
    Qu, YH
    Gu, J
    Wang, JT
    Hsu, HC
    Mountz, JD
    Baldwin, NE
    Langston, MA
    Threadgill, DW
    Manly, KF
    Williams, RW
    [J]. NATURE GENETICS, 2005, 37 (03) : 233 - 242
  • [8] Generating all maximal induced subgraphs for hereditary and connected-hereditary graph properties
    Cohen, Sara
    Kimelfeld, Benny
    Sagiv, Yehoshua
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2008, 74 (07) : 1147 - 1159
  • [9] Transmission network analysis in tuberculosis contact investigations
    Cook, Victoria J.
    Sun, Sumi J.
    Tapia, Jane
    Muth, Stephen Q.
    Argueello, D. Fermin
    Lewis, Bryan L.
    Rothenberg, Richard B.
    McElroy, Peter D.
    [J]. JOURNAL OF INFECTIOUS DISEASES, 2007, 196 (10) : 1517 - 1527
  • [10] Dehne F, 2006, LECT NOTES COMPUT SC, V4169, P13