Rate-Distortion Theory for Clustering in the Perceptual Space

Cited by: 3
Authors
Bardera, Anton [1]
Bramon, Roger [1]
Ruiz, Marc [1]
Boada, Imma [1]
Affiliations
[1] Univ Girona, Graph & Imaging Lab, Girona 17003, Spain
Keywords
information visualization; rate-distortion theory; clustering; information theory; VISUALIZATION DESIGN; ALGORITHMS; TAXONOMY; CAPACITY; CHANNEL; SYSTEM; TREES
DOI
10.3390/e19090438
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
Extracting relevant information from large data sets has become a major challenge in data visualization. Clustering techniques, which classify data into groups according to similarity metrics, are a suitable strategy for tackling this problem. Generally, these techniques are applied in the data space as an independent step prior to visualization. In this paper, we propose clustering in the perceptual space by maximizing the mutual information between the original data and the final visualization. To this end, we present a new information-theoretic framework based on rate-distortion theory that allows us to achieve maximally compressed data with minimal signal distortion. Using this framework, we propose a methodology for designing a visualization process that minimizes the information loss during the clustering process. We present three application examples of the proposed methodology in different visualization techniques: scatterplots, parallel coordinates, and summary trees.
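The trade-off the abstract describes, compressing the data while keeping distortion low, can be illustrated with a toy sketch. This is not the authors' method, which operates in the perceptual space of a visualization. It assumes equiprobable one-dimensional points, a hard clustering, and squared error as the distortion measure; the function name `rate_and_distortion` and the sample data are hypothetical. Under these assumptions the rate I(X; X_hat) reduces to the entropy of the cluster sizes.

```python
import math
from collections import Counter


def rate_and_distortion(points, labels):
    """Return (rate in bits, mean squared distortion) for a hard clustering.

    Assumption: points are equiprobable and each point belongs to exactly
    one cluster, so I(X; X_hat) equals the entropy H(X_hat) of cluster sizes.
    """
    n = len(points)
    sizes = Counter(labels)
    # Rate: entropy of the cluster-membership distribution.
    rate = -sum((c / n) * math.log2(c / n) for c in sizes.values())
    # Distortion: mean squared distance from each point to its cluster centroid.
    sums = {}
    for x, k in zip(points, labels):
        sums[k] = sums.get(k, 0.0) + x
    centroids = {k: sums[k] / sizes[k] for k in sizes}
    distortion = sum((x - centroids[k]) ** 2 for x, k in zip(points, labels)) / n
    return rate, distortion


points = [0.0, 1.0, 10.0, 11.0]                        # hypothetical data
fine = rate_and_distortion(points, [0, 1, 2, 3])       # one cluster per point
coarse = rate_and_distortion(points, [0, 0, 1, 1])     # two clusters
print(fine, coarse)
```

With these sample points, the coarser clustering lowers the rate (fewer bits are needed to identify a cluster) at the cost of nonzero distortion. A rate-distortion formulation looks for the best such trade-off.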
Pages: 17