An Algorithm of Inductively Identifying Clusters From Attributed Graphs

Cited by: 32
Authors
Hu, Lun [1]
Yang, Shicheng [2]
Luo, Xin [3, 4]
Zhou, MengChu [5, 6]
Affiliations
[1] Chinese Acad Sci, Tech Inst Phys & Chem, Urumqi 830000, Peoples R China
[2] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan 430070, Peoples R China
[3] Chinese Acad Sci, Chongqing Inst Green & Intelligent Technol, Chongqing Engn Res Ctr Big Data Applicat Smart Ci, Chongqing Key Lab Big Data & Intelligent Comp, Chongqing 400714, Peoples R China
[4] Hengrui Chongqing Artificial Intelligence Res Ctr, Dept Big Data Analyses Techn, Chongqing 401331, Peoples R China
[5] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
[6] Macau Univ Sci & Technol, Inst Syst Engn, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Clustering algorithms; Big Data; Task analysis; Portfolios; Social networking (online); Signal processing algorithms; Weight measurement; Attributed graph; clustering; classification; PROTEIN COMPLEXES; GENE ONTOLOGY; MODULARITY; PATTERNS;
DOI
10.1109/TBDATA.2020.2964544
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Attributed graphs are widely used to represent network data in which attribute information about the nodes is available. To identify clusters in attributed graphs, most existing solutions rely on particular assumptions about the characteristics of the clusters of interest. However, it is not known whether such assumed characteristics are consistent with a given attributed graph. To overcome this issue, we introduce an inductive clustering algorithm that addresses the clustering problem for attributed graphs without making any assumption about the clusters. To do so, we first process the attribute information to obtain pairs of attribute values that co-occur in adjacent nodes with significant frequency, as we believe that such pairs can represent the characteristics of a given attributed graph. For two adjacent nodes, the likelihood of their being grouped in the same cluster is weighted by the ability of their attribute-value pairs to characterize the graph. Then, based on these verified characteristics instead of assumed ones, a depth-first search strategy is applied to perform the clustering task. Moreover, we are able to classify clusters so that their significance can be indicated. The experimental results demonstrate the performance and usefulness of our algorithm.
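The abstract does not give the algorithm's details, so the following is only a minimal sketch of the three steps it names (finding significantly co-occurring attribute-value pairs, weighting edges, and clustering by depth-first search), not the authors' implementation. The significance test (a standardized residual against an independence baseline), the threshold `z_threshold`, the rule that an edge is kept when its weight is positive, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product
from math import sqrt


def significant_pairs(edges, attrs, z_threshold=1.96):
    """Return {(a, b): z} for value pairs over-represented across edges.

    Assumption: significance is a standardized residual of the observed
    count against the count expected if endpoint values were independent.
    """
    observed = Counter()  # unordered value pair -> co-occurrences across edges
    marginal = Counter()  # value -> appearances at either side of a pair
    total = 0
    for u, v in edges:
        for a, b in product(attrs[u], attrs[v]):
            observed[tuple(sorted((a, b)))] += 1
            marginal[a] += 1
            marginal[b] += 1
            total += 1
    result = {}
    for (a, b), count in observed.items():
        p_a = marginal[a] / (2 * total)
        p_b = marginal[b] / (2 * total)
        # An unordered pair of distinct values can arise in two orders.
        expected = total * p_a * p_b * (1 if a == b else 2)
        z = (count - expected) / sqrt(expected)
        if z > z_threshold:
            result[(a, b)] = z
    return result


def edge_weight(u, v, attrs, pairs):
    """Sum the scores of the significant pairs linking nodes u and v."""
    return sum(pairs.get(tuple(sorted((a, b))), 0.0)
               for a, b in product(attrs[u], attrs[v]))


def cluster(nodes, edges, attrs, z_threshold=1.96):
    """Group nodes reachable through positively weighted edges (iterative DFS)."""
    pairs = significant_pairs(edges, attrs, z_threshold)
    strong = defaultdict(set)
    for u, v in edges:
        if edge_weight(u, v, attrs, pairs) > 0:
            strong[u].add(v)
            strong[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in strong[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(members))
    return sorted(clusters)


# Toy graph: two triangles with different attribute values, joined by one edge.
attrs = {0: {"x"}, 1: {"x"}, 2: {"x"}, 3: {"y"}, 4: {"y"}, 5: {"y"}}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# A low threshold is used only because the toy graph is tiny.
print(cluster(range(6), edges, attrs, z_threshold=0.5))
```

In this sketch, nodes with no positively weighted edge end up as singleton clusters; the classification of clusters by significance mentioned in the abstract is not covered.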
Pages: 523-534
Page count: 12