SpaceMAP: Visualizing High-dimensional Data by Space Expansion

Times cited: 0
Authors
Zu, Xinrui [1]
Tao, Qian [1]
Affiliations
[1] Delft Univ Technol, Dept Imaging Phys, Delft, Netherlands
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022
Keywords
REDUCTION; EIGENMAPS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Dimensionality reduction (DR) of high-dimensional data is of theoretical and practical interest in machine learning. However, there exist intriguing, non-intuitive discrepancies between the geometry of high- and low-dimensional space. We look into such discrepancies and propose a novel visualization method called Space-based Manifold Approximation and Projection (SpaceMAP). Our method establishes an analytical transformation on distance metrics between spaces to address the "crowding problem" in DR. With the proposed equivalent extended distance (EED), we are able to match the capacity of high- and low-dimensional space in a principled manner. To handle complex data with different manifold properties, we propose hierarchical manifold approximation to model the similarity function in a data-specific manner. We evaluated SpaceMAP on a range of synthetic and real datasets with varying manifold properties, and demonstrated its excellent performance in comparison with classical and state-of-the-art DR methods. In particular, the concept of space expansion provides a generic framework for understanding nonlinear DR methods including the popular t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP).
Pages: 17