Optimized leaf ordering with class labels for hierarchical clustering

Cited by: 3
Authors
Novoselova, Natalia [1]
Wang, Junxi [2]
Klawonn, Frank [2, 3]
Affiliations
[1] United Inst Informat Problems, Dept Bioinformat, Surganova Str 6, Minsk 220012, BELARUS
[2] Helmholtz Ctr Infect Res, Biostat, D-38124 Braunschweig, Germany
[3] Ostfalia Univ Appl Sci, Dept Comp Sci, D-38302 Wolfenbuttel, Germany
Keywords
Hierarchical clustering; dendrogram; leaf ordering; dynamic programming; biomedical data
DOI
10.1142/S0219720015500122
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Hierarchical clustering is extensively used in the bioinformatics community to analyze biomedical data. These data are often tagged with class labels, such as disease subtypes or gene ontology (GO) terms. Heatmaps combined with dendrograms are the standard way to visualize the results of hierarchical clustering. The heatmap can be enriched by an additional color bar at the side, indicating for each instance in the data set the class to which it belongs. In the ideal case, when the clustering matches the classes perfectly, one would expect instances from the same class to cluster together and the color bar to consist of well-separated color blocks without frequent alternation of colors (classes). But even when instances from the same class cluster perfectly together, the dendrogram might not reflect this, because its representation is not unique. In this paper, we propose a leaf ordering algorithm for the dendrogram that, while preserving the hierarchical clustering result, tries to group instances from the same class together. It is based on dynamic programming, which can efficiently compute an optimal or nearly optimal order consistent with the structure of the tree.
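
The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, only a minimal dynamic program built on assumptions that the abstract does not state: the dendrogram is a binary tree given as nested tuples of leaf indices, the only allowed operation is swapping the two subtrees at an internal node (which leaves the clustering unchanged), and the objective is the number of class changes between neighboring leaves. The names `best_orders` and `order_leaves` are hypothetical.

```python
from itertools import product


def best_orders(tree, labels):
    """Map (first_label, last_label) -> (changes, leaf_order) for a subtree.

    `tree` is a leaf index (int) or a pair (left, right) of subtrees.
    Only the cheapest ordering per boundary-label pair is kept, because the
    cost of joining two subtrees depends on their boundary labels alone.
    """
    if isinstance(tree, int):
        lab = labels[tree]
        return {(lab, lab): (0, [tree])}

    left, right = (best_orders(child, labels) for child in tree)
    table = {}
    # Try both orientations of the two children (i.e. flipping this node).
    for first, second in ((left, right), (right, left)):
        for ((l1, r1), (n1, o1)), ((l2, r2), (n2, o2)) in product(
            first.items(), second.items()
        ):
            # One extra change if the labels meeting at the seam differ.
            cost = n1 + n2 + (r1 != l2)
            if (l1, r2) not in table or cost < table[(l1, r2)][0]:
                table[(l1, r2)] = (cost, o1 + o2)
    return table


def order_leaves(tree, labels):
    """Return (minimal number of class changes, one leaf order achieving it)."""
    return min(best_orders(tree, labels).values(), key=lambda t: t[0])


# Toy dendrogram: leaves 0 and 1 merge first, as do 2 and 3.
changes, order = order_leaves(((0, 1), (2, 3)), ["A", "B", "A", "B"])
```

For this toy tree the unflipped order 0, 1, 2, 3 gives the label sequence A B A B with three class changes, while the sketch finds an order with two. With K distinct classes each table holds at most K² entries, so the sketch stays cheap when there are few classes; the list concatenation is kept for readability rather than speed.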
Pages: 19