Tree preserving embedding

Times cited: 14
Authors
Shieh, Albert D. [1]
Hashimoto, Tatsunori B. [1]
Airoldi, Edoardo M. [1]
Affiliations
[1] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
hierarchical clustering; multidimensional scaling; nonlinear dimensionality reduction; density
DOI
10.1073/pnas.1018393108
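Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract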
The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowding problem or can only separate clusters at a single resolution. We develop a new approach to dimensionality reduction: tree preserving embedding. Our approach uses the topological notion of connectedness to separate clusters at all resolutions. We provide a formal guarantee of cluster separation for our approach that holds for finite samples. Our approach requires no parameters and can handle general types of data, making it easy to use in practice and suggesting new strategies for robust data visualization.
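The notion of connectedness mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It shows only one common reading of "clusters at all resolutions": at a given distance threshold, the clusters are the connected components of the graph that links any two points no farther apart than that threshold (single-linkage clusters), and varying the threshold yields a nested hierarchy. The function name `connected_clusters`, the toy points, and the thresholds are all hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def connected_clusters(points, threshold):
    """Return single-linkage clusters at one resolution.

    Two points are linked when their Euclidean distance is <= threshold;
    clusters are the connected components of the resulting graph.
    Illustrative only: this is not the tree preserving embedding method.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical toy data: two tight pairs and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (20.0, 0.0)]

for t in (0.5, 1.0, 4.0, 14.0):
    print(t, connected_clusters(points, t))
```

As the threshold grows, clusters only merge and never split, so the components at every threshold together form a tree. An embedding that preserves cluster structure in this sense would need to reproduce the same merges in the low-dimensional space.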
Pages: 16916-16921
Number of pages: 6