InCHlib - interactive cluster heatmap for web applications

Cited by: 45
Authors
Skuta, Ctibor [1,2]
Bartunek, Petr [2]
Svozil, Daniel [1,2]
Affiliations
[1] Prague Inst Chem Technol, Fac Chem Technol, Lab Informat & Chem, CZ-16628 Prague, Czech Republic
[2] Acad Sci Czech Republ, Inst Mol Genet, CZ OPENSCREEN, Vvi, CZ-14220 Prague, Czech Republic
Keywords
Data clustering; Cluster heatmap; Scientific visualization; Web integration; Client-side scripting; JavaScript library; Big data; Exploration; CANCER GENOMICS BROWSER; MICROARRAY DATA; MOLECULAR DIVERSITY; SCAFFOLD DIVERSITY; NATURAL-PRODUCTS; LEAD DISCOVERY; VISUALIZATION; DRUGS; PLATFORM; TARGETS
DOI
10.1186/s13321-014-0044-4
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Background: Hierarchical clustering is an exploratory data analysis method that reveals groups (clusters) of similar objects. The result of hierarchical clustering is a tree structure, called a dendrogram, that shows the arrangement of the individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called a 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram, which is displayed at the side of the heatmap.
Results: We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters, and to flexibly modify the heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib are facilitated by the Python utility script inchlib_clust.
Conclusions: The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.
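The row ordering described in the Background can be illustrated with a minimal sketch in pure Python. It is not the code of InCHlib or inchlib_clust: the function name `cluster_order`, the use of Euclidean distance, and the choice of average linkage are all assumptions made for this illustration. The sketch builds a dendrogram as nested tuples and reads the row order for the heatmap from its leaves.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_order(rows):
    """Hypothetical helper: average-linkage agglomerative clustering.

    Returns (tree, order), where tree is the dendrogram as nested tuples
    of row indices and order is the left-to-right sequence of its leaves.
    """
    # Each cluster is (subtree, member_indices); a leaf's subtree is its index.
    clusters = [(i, [i]) for i in range(len(rows))]

    def avg_distance(a, b):
        # Average linkage: mean distance over all cross-cluster row pairs.
        pairs = [(i, j) for i in a[1] for j in b[1]]
        return sum(dist(rows[i], rows[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find and merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: avg_distance(*p))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))

    return clusters[0]


# Toy data matrix (assumed values): rows 0 and 2 are similar, as are 1 and 3.
rows = [
    [1.0, 1.1, 0.9],
    [5.0, 5.2, 4.9],
    [1.1, 1.0, 1.0],
    [5.1, 5.0, 5.0],
]

tree, order = cluster_order(rows)
print("dendrogram:", tree)
for i in order:
    # Rows are printed in dendrogram order, so similar rows are adjacent.
    print(i, rows[i])
```

In the reordered output, similar rows end up next to each other, which is the property that makes the block structure visible in a cluster heatmap. This quadratic-per-merge sketch is only suitable for small matrices; production tools typically rely on optimized clustering libraries.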
Pages: 9