LDSScanner: Exploratory Analysis of Low-Dimensional Structures in High-Dimensional Datasets

Cited by: 59
Authors
Xia, Jiazhi [1]
Ye, Fenjin [1]
Chen, Wei [2]
Wang, Yusi [1]
Chen, Weifeng [3]
Ma, Yuxin [2]
Tung, Anthony K. H. [4]
Affiliations
[1] Cent South Univ, Changsha, Hunan, Peoples R China
[2] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[3] Zhejiang Univ Finance & Econ, Hangzhou, Zhejiang, Peoples R China
[4] Natl Univ Singapore, Singapore, Singapore
Funding
National Natural Science Foundation of China (Major Program); U.S. National Science Foundation
Keywords
High-dimensional data; low-dimensional structure; subspace; manifold; visual exploration; visualization; reduction; metrics
DOI
10.1109/TVCG.2017.2744098
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Many approaches for analyzing a high-dimensional dataset assume that the dataset contains specific structures, e.g., clusters in linear subspaces or non-linear manifolds. This yields a trial-and-error process to verify the appropriate model and parameters. This paper contributes an exploratory interface that supports visual identification of low-dimensional structures in a high-dimensional dataset, and facilitates the optimized selection of data models and configurations. Our key idea is to abstract a set of global and local feature descriptors from the neighborhood graph-based representation of the latent low-dimensional structure, such as pairwise geodesic distance (GD) among points and pairwise local tangent space divergence (LTSD) among pointwise local tangent spaces (LTS). We propose a new LTSD-GD view, which is constructed by mapping LTSD and GD to the x axis and y axis using 1D multidimensional scaling, respectively. Unlike traditional dimensionality reduction methods that preserve various kinds of distances among points, the LTSD-GD view presents the distribution of pointwise LTS (x axis) and the variation of LTS in structures (the combination of x axis and y axis). We design and implement a suite of visual tools for navigating and reasoning about intrinsic structures of a high-dimensional dataset. Three case studies verify the effectiveness of our approach.
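The pipeline described in the abstract (neighborhood graph, pairwise GD, pairwise LTSD, then 1D multidimensional scaling per axis) can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption for illustration: local tangent spaces are estimated by PCA on each point's k nearest neighbors, LTSD is taken as the norm of the principal angles between two tangent spaces, GD is the shortest-path length on a symmetric kNN graph, and the 1D embedding uses classical MDS. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def knn(X, k):
    """Return (neighbor indices, full Euclidean distance matrix)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Column 0 of the sorted order is the point itself (distance 0).
    return np.argsort(d, axis=1, kind="stable")[:, 1:k + 1], d

def geodesic_distances(X, k):
    """Assumed GD: shortest paths on a symmetric kNN graph (Floyd-Warshall)."""
    n = len(X)
    nbrs, d = knn(X, k)
    g = np.full((n, n), np.inf)
    np.fill_diagonal(g, 0.0)
    for i in range(n):
        g[i, nbrs[i]] = d[i, nbrs[i]]
        g[nbrs[i], i] = d[i, nbrs[i]]
    for m in range(n):
        g = np.minimum(g, g[:, m:m + 1] + g[m:m + 1, :])
    return g

def local_tangent_spaces(X, k, dim):
    """Assumed LTS: top `dim` principal directions of each kNN patch."""
    nbrs, _ = knn(X, k)
    bases = []
    for i in range(len(X)):
        patch = X[np.append(nbrs[i], i)]
        patch = patch - patch.mean(axis=0)
        _, _, vt = np.linalg.svd(patch, full_matrices=False)
        bases.append(vt[:dim].T)  # columns form an orthonormal basis
    return bases

def tangent_divergence(bases):
    """Assumed LTSD: Euclidean norm of principal angles between subspaces."""
    n = len(bases)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            s = np.linalg.svd(bases[i].T @ bases[j], compute_uv=False)
            angles = np.arccos(np.clip(s, -1.0, 1.0))
            out[i, j] = out[j, i] = np.linalg.norm(angles)
    return out

def mds_1d(dist):
    """Classical MDS reduced to a single coordinate per point."""
    n = len(dist)
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (dist ** 2) @ J            # double-centered Gram matrix
    w, v = np.linalg.eigh(B)                  # eigenvalues in ascending order
    return v[:, -1] * np.sqrt(max(w[-1], 0.0))

def ltsd_gd_view(X, k, dim):
    """2D layout: x from LTSD, y from GD, each via 1D MDS."""
    gd = geodesic_distances(X, k)
    ltsd = tangent_divergence(local_tangent_spaces(X, k, dim))
    return np.column_stack([mds_1d(ltsd), mds_1d(gd)])
```

For example, calling `ltsd_gd_view(X, k=10, dim=2)` on an `(n, D)` array returns an `(n, 2)` array of view coordinates, provided the kNN graph is connected. In such a layout, points lying on a flat linear subspace would share nearly the same x value, while points on a curved manifold would spread along x as their tangent spaces rotate. The Floyd-Warshall step is cubic in the number of points, so this sketch only suits small samples.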
Pages: 236-245
Page count: 10