A Survey of Co-Clustering

Cited by: 0
Authors
Wang, Hongjun [1, 2]
Song, Yi [1]
Chen, Wei [1]
Luo, Zhipeng [1]
Li, Chongshou [1]
Li, Tianrui [1]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu, Peoples R China
[2] Minist Educ, Engn Res Ctr Sustainable Urban Intelligent Transpo, Chengdu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
co-clustering; information theory; graph theory; matrix factorization; NONNEGATIVE MATRIX FACTORIZATION; TRI-FACTORIZATION; MICROARRAY DATA; ALGORITHM; SPARSE; MODEL; NETWORK;
DOI
10.1145/3681793
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Co-clustering clusters samples and features simultaneously, which also reveals the relationship between row clusters and column clusters. It has therefore attracted extensive research attention and is widely used in recommendation systems, gene analysis, medical data analysis, natural language processing, image analysis, and social network analysis. In this article, we survey research on co-clustering, with emphasis on the latest advances, and identify current research challenges and future directions. First, because researchers differ on how to define co-clustering, this article summarizes the definition of co-clustering, its extended definitions, and related issues from the perspectives of various researchers. Second, existing co-clustering techniques are broadly grouped into four classes: information-theory-based, graph-theory-based, matrix-factorization-based, and those based on other theories. Third, applications of co-clustering are reviewed, including recommendation systems, medical data analysis, natural language processing, image analysis, and social network analysis. Furthermore, 10 popular co-clustering algorithms are empirically studied on 10 benchmark datasets using 4 metrics (accuracy, purity, block discriminant index, and running time), and the results are reported. Finally, future work is outlined to give insight into the research challenges of co-clustering.
Pages: 28
Related papers
50 in total
  • [1] Joint co-clustering: Co-clustering of genomic and clinical bioimaging data
    Ficarra, Elisa
    De Micheli, Giovanni
    Yoon, Sungroh
    Benini, Luca
    Macii, Enrico
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2008, 55 (05) : 938 - 949
  • [2] Bayesian Co-clustering
    Shan, Hanhuai
    Banerjee, Arindam
    ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 530 - 539
  • [3] Co-Clustering on Manifolds
    Gu, Quanquan
    Zhou, Jie
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009, : 359 - 367
  • [4] Directional co-clustering
    Aghiles Salah
    Mohamed Nadif
    Advances in Data Analysis and Classification, 2019, 13 : 591 - 620
  • [5] Directional co-clustering
    Salah, Aghiles
    Nadif, Mohamed
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (03) : 591 - 620
  • [6] Bayesian co-clustering
    Domeniconi, Carlotta
    Laskey, Kathryn
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2015, 7 (05) : 347 - 356
  • [7] Spectral co-clustering ensemble
    Huang, Shudong
    Wang, Hongjun
    Li, Dingcheng
    Yang, Yan
    Li, Tianrui
    KNOWLEDGE-BASED SYSTEMS, 2015, 84 : 46 - 55
  • [8] Latent Dirichlet co-clustering
    Shafiei, M. Mahdi
    Milios, Evangelos E.
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 542 - +
  • [9] Evolutionary Spectral Co-Clustering
    Green, Nathan
    Rege, Manjeet
    Liu, Xumin
    Bailey, Reynold
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011, : 1074 - 1081
  • [10] Co-clustering for Microdata Anonymization
    Benkhelif, Tarek
    Fessant, Francoise
    Clerot, Fabrice
    Raschia, Guillaume
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2017, PT I, 2017, 10438 : 343 - 351